Preface



The circle is closed. The European Modula-2 Conference was originally launched 
with the goal of increasing the popularity of Modula-2, a programming language 
created by Niklaus Wirth and his team at ETH Zurich as a successor of Pascal. 
For more than a decade, the conference has wandered through Europe, passing 
Bled, Slovenia, in 1987, Loughborough, UK, in 1990, Ulm, Germany, in 1994, and 
Linz, Austria, in 1997. Now, at the beginning of the new millennium, it is back 
at its roots in Zurich, Switzerland. While traveling through space and time, the 
conference has mutated. It has widened its scope and changed its name to Joint 
Modular Languages Conference (JMLC). With an invariant focus, though, on 
modular software construction in teaching, research, and “out there” in industry. 

This topic has never been more important than today, ironically not because 
of insufficient language support but, quite on the contrary, due to a truly con- 
fusing variety of modular concepts offered by modern languages: modules, pack- 
ages, classes, and components, the newest and still controversial trend. “The 
recent notion of component is still very vaguely defined, so vaguely, in fact, that 
it almost seems advisable to ignore it.” (Wirth in his article “Records, Modules, 
Objects, Classes, Components” in honor of Hoare’s retirement in 1999). Clarifi- 
cation is needed. 

The JMLC 2000 featured four distinguished speakers: Ole Lehrmann Madsen, 
Bertrand Meyer, Clemens Szyperski, and Niklaus Wirth, and 20 presentations of 
high-quality papers from 9 countries, a careful selection from 54 papers initi- 
ally submitted to the refereeing process. Perspectives comprised parallel and 
distributed computing, components, extensions and applications, compilers, and 
runtime environments. A tutorial prelude taught by international experts shed 
light on commercially available component systems, including COM, JavaBeans, 
CORBA, Component Pascal, and Eiffel, and a special Oberon event provided a 
forum for Oberon developers from industry and academia to present their latest 
creations. 

I would like to thank Springer-Verlag for publishing these proceedings and 
Wolfgang Weck for his editorial assistance. My thanks also go to the program 
committee members and to the referees whose competence assured the quality 
of this conference. Last but not least I thank all the helpful people at ETH and 
in particular Eva Ruiz, Patrik Reali, Marco Sanvido, Michela Taufer, and Andre 
Fischer for their work and dedication. 
The Development of 
Procedural Programming Languages 
Personal Contributions and Perspectives 



Niklaus Wirth 



Abstract. I became involved in the design of a successor of Algol 60 in 
the years 1962-67. The result was Algol-W (66), and later the Algol-style 
Pascal (70), Modula-2 (79), and Oberon (88). In turn, they introduced 
the concepts of data structuring and typing, modular decomposition and 
separate compilation, and object-orientation. In this talk, we summarize 
these developments and recount some of the influences and events that 
determined the design and implementation of these languages. In the 
early 60s, CS was much influenced by and concentrated around programming 
languages. Various programming paradigms emerged; we focus on 
the procedural branch, directed toward system programming and forming 
the backbone of engineering and data processing methods and tools. I 
conclude with some remarks about how the gap between methods taught 
and methods practiced in software design might be narrowed. 



1 Algol (1960) 

I entered the field of programming in 1961 as a graduate student at UC Ber- 
keley, being interested in computer design as an electronics engineer. I met a 
group of people working on a compiler, a large program converting program text 
into machine code. Particularly intriguing was the fact that the compiler was 
described in the same language (Neliac) that it accepted as input. This was the 
basis of the bootstrapping technique, the gradual development of language and 
compiler. The program was outrageously complicated by the standards of the 
day, and in effect a single person held the keys to it. For a student seeking a 
thesis topic, this project appeared as ideal with its evident need for disentangling 
the maze. Structure, building rules, a scientific approach was called for. 

In 1960 the Report on Algol 60 had appeared [1]. It offered a promising basis for 
introducing a scientific approach due to its mathematically defined syntax. It is 
well known that in those years much effort went into the development of syntax 
analyzers. I designed my first compiler for a subset of Algol 60 for the IBM 704 
computer, which was ill-suited for the implementation of an important feature of 
Algol: the recursive procedure. This, and more so my work on generalizing Algol 
came to the attention of the IFIP Working Group 2.1, and thus I became involved 
in language design along the line of Algol. What made Algol so interesting? 
1. Its precise, formal definition of syntax. For the first time, it was possible 
to derive the scaffolding of a translator from the definition of the language 
systematically. The syntax constituted the contract between programmer 
and compiler. 

2. Its close adherence to traditional, mathematical notation. In particular, ex- 
pressions followed longstanding tradition and could be used in their full 
generality independent of their place in the program, for example as indices 
and parameters. 

3. Its block structure, providing the possibility to choose identifiers with limi- 
ted scope independent of identifiers declared in other parts of the program, 
appeared as particularly important for introducing order into programs. 

4. Its procedures with typed parameters provided a flexibility superior to that 
of the then common subroutine. Recursion opened the door to new ways of 
reasoning about programs, and it was a welcome challenge for implementors. 
On the other hand, Algol’s name parameter was, as some suspected, a too 
high-flying mathematical notion to be appropriate for programming tasks. 

But Algol also had its deficiencies, and remedies were necessary if Algol was to 
have a future at the side of Fortran. The argument of efficiency was ever-present, 
and some of Algol’s features were definitely an obstacle in the competition with 
Fortran. The four main shortcomings were 

1. Some ambiguities in the syntax of expressions and statements. At the time 
they gave rise to long debates, and were proof of the benefits of a precise, 
formal definition. 

2. The field of application was too narrow in view of new emerging areas of 
computer usage. Languages like Cobol and Lisp (1962) had opened new 
paradigms and concepts, like records, files, dynamic data structures. 

3. Some of its constructs were unnecessarily general, forbidding an efficient 
implementation. Primarily the For statement, name parameter, and Own 
variables were subjected to scrutiny and heated criticism. 

4. The absence of facilities for input and output. As they were left to indi- 
vidual implementations, the concept of a standard language was severely 
compromised for practical purposes. 



The outcome of WG 2.1’s effort consisted of two languages: Algol W [2], 
implemented at Stanford in 1967, and Algol 68 [3], implemented by 1972. 



2 Pascal (1970) 



In search of a language suitable for teaching programming in a structured fashion, 
and for the development of system software, I designed Pascal (1968-1972), free 
of conflicting demands from a large committee and with a clear purpose in mind 
[4,5,6]. A Pascal compiler was available by 1970, and beginning in 1972 I used 
it in courses at ETH Zürich. The language report followed the example of Algol 
closely, and the language can truly be said to retain the “spirit of Algol”. The 
highlights of Pascal were 



1. Simple, structured statements for conditional (If, Case) and repeated execu- 
tion (While, Repeat, For). 

2. The user definability of scalar data types. Apart from standard types Boo- 
lean, Integer, Real, and Char there were enumerations. 

3. The application of structuring principles to data type definitions. Structured 
types were records, arrays, files, sets, and their nesting provided the freedom 
to construct complex data structures. 

4. Strict, static typing. Every constant, variable, function or parameter had a 
type that could be determined by a mere scan of the program text. 

5. Dynamic data structures could be built with the aid of pointers, opening the 
field of list processing. According to Hoare, pointers were bound to objects 
of a fixed type, thus subjected to strict compile-time type checking as well. 

6. Procedures can be used recursively. The controversial name parameter was 
replaced by the reference (VAR) parameter, appropriate for structured ob- 
jects and for passing results. 



In short, Pascal embodied the ideas of Structured Programming. Our first com- 
piler (for the CDC 6000 computer) was programmed in Fortran with the idea to 
hand-translate it to Pascal for further bootstrapping. This turned out to be a fai- 
lure. A second compiler project was launched, starting by programming in Pascal 
itself. The initial bootstrap was preceded by a hand-translation into the low-level 
language Scallop, which was much like the later C. Many other implementation 
efforts were undertaken at various universities. The first was a compiler for the 
ICL 1900 computer at the Queens University at Belfast. But the genuine break- 
through occurred years later with the advent of the micro-computer around 1977. 
With it, computing became available to many at school and at home, and the 
teaching of programming advanced to a subject of “general education”. As the 
market grew, industry became interested. The distribution of compilers was sig- 
nificantly helped by our Pascal-P system: The compiler, written in Pascal itself, 
generated P-code for a virtual stack computer. The recipient’s task was thereby 
reduced to programming an interpreter of P-code in the local assembler language. 
The notable Pascal products were UCSD Pascal and Borland Pascal, low-cost 
programming tools affordable by everyone. By 1980 the competition had shif- 
ted from Algol vs. Fortran to Pascal vs. Basic. Pascal subsequently influenced 
other language designs, notably Concurrent Pascal and Mesa. A comprehensive 
account of Pascal’s development and influence has appeared in [7]. 
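
The Pascal-P scheme described above, a compiler emitting code for a virtual stack computer and a small interpreter written anew on each machine, can be illustrated by a minimal sketch. The instruction names used here (PUSH, ADD, MUL, JMPZ) are invented for illustration and are not the actual P-code instruction set; Python merely stands in for the local assembler language.

```python
# Toy interpreter for a stack machine. The instruction set is hypothetical
# and is NOT the real P-code of the Pascal-P system.

def run(program):
    """Execute a list of (opcode, operand) pairs and return the final stack."""
    stack = []
    pc = 0  # program counter
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "PUSH":        # push a constant
            stack.append(arg)
        elif op == "ADD":       # replace the two top elements by their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":       # replace the two top elements by their product
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "JMPZ":      # pop; jump to address arg if the value is zero
            if stack.pop() == 0:
                pc = arg
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack


# Code a compiler might emit for the expression (2 + 3) * 4
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
result = run(program)
```

The point of the design is that only `run` has to be rewritten for a new machine; the compiler and the code it emits remain unchanged.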
3 Modula-2 (1979) 



Of course, Pascal also had its shortcomings, particularly in view of the growing 
power of computing equipment and the consequent increase in expectations 
of software. After 10 years of use of Pascal, some features and their importance 
appeared in a different light. For example, one had learnt to program without 
the controversial goto statement. In hindsight, several of the “concessions” made 
to tradition in Pascal were recognized as ill-advised. The handling of input and 
output in Pascal was considered as inadequate and inflexible. It was time for a 
successor to appear. Instead of Pascal-2, it was named Modula-2 [8,9]. 

The goto statement, one of the concessions mentioned, was eliminated. To cater 
for situations that could be foreseen where a goto facility would be missed, the 
collection of repetitive statements was enlarged. In addition to the while, repeat, 
and for statements, a general loop statement containing explicit exit statements 
was offered. So, one concession was replaced by a milder one. In practice the 
loop statement was merely a facility for increased efficiency in rare cases. 

On the syntactic level, the open-ended if, for, and while statements of Pascal, 
giving rise to syntactic ambiguities, were adapted to the general rule that every 
structured statement not only has a unique starting symbol, but also an explicit 
closing symbol (END). 

Pascal                              Modula

IF P THEN S                         IF P THEN S END

IF P THEN BEGIN S1; S2 END          IF P THEN S1; S2 END

IF P THEN IF Q THEN S1 ELSE S2      IF P THEN
                                      IF Q THEN S1 ELSE S2 END
                                    END

                                    IF P THEN
                                      IF Q THEN S1 END
                                    ELSE S2
                                    END



The most significant innovation of Modula with respect to Pascal, however, was 
the module concept. It derived from the concepts of abstract data types and in- 
formation hiding, and was pioneered in the language Mesa, designed at the Xerox 
Palo Alto Research Center in 1976. Whereas in Pascal every program is a single 
piece of text, Modula facilitates the construction of large systems consisting of 
many modules, each a separate text, separately compilable. In contrast to custo- 
mary independent compilation, separate compilation guarantees complete type 
checking across module boundaries at compile-time, an indispensable requisite 
for any truly typed language implementation. It requires that module interfaces 
be specified, and that the compiler must have at its disposal type information 
about module A, whenever compiling any client module B referring to A. This 
led to the technique of symbol files, invented with the implementation of the 
language Mesa. Modula generalized the concept of modules to nested modules, 
in analogy to nested procedures (procedures declared local to another proce- 
dure). It later turned out that nested modules are rarely of much benefit and 
complicate qualified notations. 

An immediate and beneficial consequence of the module facility was that several 
language issues could be omitted from the language proper if it was possible to 
express them in the language itself in terms of more elementary constructs. In 
this case, they would be programmed and encapsulated in a module, possibly 
declared as a standard component belonging to every implementation. For ex- 
ample, the entire subject of input and output could be delegated to a module in 
a standard library. Another example was the file system, possible because access 
to peripheral devices could be expressed in Modula and be properly encapsu- 
lated in driver modules. This was the first step in a continuing trend towards 
standard program libraries. 

Another significant new facility was the procedure type. It lets procedures be 
assigned to variables, and is a generalization of the (function) procedure used 
as parameter to another procedure. This is one of the two pillars of object- 
oriented programming. Complete parameter type specifications were considered 
as indispensable, as they constituted a remaining trouble spot and pitfall in 
Pascal. 
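
The idea of a procedure type can be sketched as follows. This is an illustration in Python rather than Modula; the names RealFunction, square, and integrate are hypothetical. Note that Python does not enforce the annotation, whereas in Modula the complete parameter specification of a procedure type is checked at compile time.

```python
from typing import Callable

# A "procedure type": any function taking a float and returning a float.
RealFunction = Callable[[float], float]

def square(x: float) -> float:
    return x * x

def integrate(f: RealFunction, a: float, b: float, n: int = 1000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# The procedure is assigned to a variable, not merely passed as a parameter.
current: RealFunction = square
area = integrate(current, 0.0, 1.0)
```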

Modula was designed with the goal to program the entire software for a work- 
station Lilith exclusively in this high level language. As a consequence, certain 
features had to be added, allowing to access special hardware facilities and to 
breach the rigid type system. These additions were made in the firm hope that 
they would be used only in rare instances in the construction of lowest level 
software, such as device drivers and storage allocators. Such parts were to be 
specified as modules safely encapsulating the dangerous statements. It turned 
out to be a misguided illusion! In particular, the so-called type transfer functions 
for breaching the type checker became very heavily misused by inexperienced 
programmers. It became obvious that a language is characterized not so much 
by what it allows to express, but more so by what it prohibits to express. 

All these well-meant additions resulted in Modula growing into a fairly sizeable 
language, in spite of several features being delegated to a module library. 
Nevertheless, it had been demonstrated that entire systems could be programmed 
efficiently without resorting to error-prone assembler code, and that static type 
checking was an invaluable asset for large system design. Modula exerted its 
influence on later languages, including Ada. Although only few commercial im- 
plementations emerged, the use of the language spread, and Modula proved its 
suitability also for embedded applications. 







Our first Modula compiler was for the DEC PDP-11 computer. Due to its small 
store (32 Kbyte), a 5-pass solution was chosen, with intermediate codes stored 
on the 2 Mbyte disk. It served as tool to implement the compiler for the actual 
target, our Lilith. With its 128 Kbyte store it allowed for a single pass compiler, 
resulting in astounding speed compared to the PDP-11. The output was M-code, 
a byte code directly interpreted by the microprogrammed Lilith hardware. Like 
in the case of Pascal, several implementation efforts were subsequently made 
elsewhere [10]. 



4 Oberon (1988) 



As in the case of Modula, the seed to Oberon was laid during a sabbatical 
leave of the author at Xerox PARC. The goal was to design a modern multi- 
tasking operating system appropriate for powerful workstations. The example 
was Cedar, and as Mesa was considered too baroque, leading to Modula, Cedar 
led to the much simpler Oberon [11,12]. The author, together with J. Gutknecht, 
embarked on the project in early 1987 with the intention to use Modula as our 
implementation language, but quickly decided to extend our quest for simplicity 
to the language too. Applying the principle “Make it as simple as possible, but 
not simpler”, we radically discarded several perhaps convenient, but not essential 
features, such as nested modules, enumeration types, variant records, sets and 
with statements. 



The wave of object-orientation in full swing, we wanted to use these techniques 
wherever they were appropriate (for example in the window (viewer) system, 
and later in the graphics editor), without escalating into making everything an 
object. We rejected the view that a new programming paradigm would have to 
replace everything known and proven, and instead showed that object-oriented 
programming relies on features that mostly are available in existing languages, 
although presented under a new terminology: 



Object-oriented    Conventional

Object             Record, accessed via pointer
Class              Record type
Method             Procedure, bound to record
Message            Call of bound procedure
Subclass           Record type extension



With the exception of the last, all concepts were present in Modula, and therefore 
a single extension sufficed to cater for object-oriented programming in Oberon: 
Type extension. In the following example, a type Element (of a list or tree 
representing a graph, and in OO-terminology called an abstract class) is the 
basis of two (concrete) types called Line and Circle: 

    Element = POINTER TO RECORD
        x, y, w, h: INTEGER; (*rectangle covering the element’s area*)
        handler: PROCEDURE (e: Element; VAR m: Message)
    END ;

    Line = POINTER TO RECORD (Element) END ; (*null extension*)

    Circle = POINTER TO RECORD (Element)
        width: INTEGER
    END

As a result, the powerful technique of object-oriented programming was obtained 
with a single, quite easy to implement language feature. We note, however, two 
differences to “conventional” OO-languages: 

1. Methods (procedures) are bound to individual records, objects, also called 
class instances, rather than to classes. Oberon is object-centered instead of 
class-centered. 

2. A call of a method m of an object x with parameter p is denoted by x.m(x, 
p) rather than x.m(p), because the object itself must be identified as a 
parameter too. The first x identifies the type to which m belongs, the second 
x the object (variable) to which m is applied. (Typically the two identifiers 
are the same). 
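
The two differences listed above can be made concrete with a small sketch. It is written in Python, not Oberon, and the handler procedures and the content of Message are hypothetical. The handler is stored in each individual record, and a call names the object twice, once to select the handler and once as parameter.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Message:
    text: str = ""          # hypothetical message content

@dataclass
class Element:
    x: int = 0
    y: int = 0
    w: int = 0
    h: int = 0
    # The handler is a field of the individual record, not of the class.
    handler: Optional[Callable[["Element", Message], None]] = None

@dataclass
class Line(Element):        # null extension
    pass

@dataclass
class Circle(Element):      # extension adding one field
    width: int = 0

def handle_line(e: Element, m: Message) -> None:
    m.text = f"line at ({e.x}, {e.y})"

def handle_circle(e: Element, m: Message) -> None:
    assert isinstance(e, Circle)   # plays the role of an Oberon type guard
    m.text = f"circle of width {e.width}"

a = Line(x=1, y=2, handler=handle_line)
b = Circle(width=5, handler=handle_circle)
msg = Message()
b.handler(b, msg)           # corresponds to x.m(x, p) in the text
```

Because the handler belongs to the record, two objects of the same type may carry different handlers, which is what "object-centered" means here.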

Along with the introduction of type extension and type inclusion, the latter re- 
lating numerical types into a family, went the discarding of various features that 
were deemed of lesser importance in the light of past experience in program- 
ming with Modula. Hence, Oberon may justly be called a trimmed-down version 
of Modula, described in a report of a mere 16 pages, and implementable by a 
small and fast single-pass compiler. Together with the basic system modules, 
it made a radical departure from the traditional concept of batch processing 
system possible. It marks the 

1. departure from batch processing, activating one program after another, one 
at a time, 

2. departure from files stored on disk as the sole interface between one task and 
later ones, 

3. departure from the strict separation between operating systems and user 
programs, and 

4. departure from the notion of a program being the unit of computation. 

Instead, the module represents the unit of separately compilable program text 
with explicit interface, whereas the unit of execution is the command. Modules 
are linked on demand when loaded (without prelinking). There occur no duplicates 
of modules, resulting in a most commendable storage economy. Any exported 
(parameterless) procedure can serve as a command to be activated by mouse- 
click in any visible program text. This concept represented a break-through in 
the way computers were used: Several tasks may coexist, typically displayed in 
the form of viewers, and the user applies commands to them in any order desired. 
For increased user convenience, pop-up menus, buttons and panels were later ad- 
ded through optional modules, programmable without additional features, the 
basic command facility covering all needs. Thus it became possible to create a 
full operating environment, programmed by two people within two years [13]. It 
gave evidence that a few simple and powerful concepts, distilled to the essen- 
tial, led to an efficient and economical implementation. Like Modula, Oberon 
influenced the design of later languages, notably Java. 
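
The command mechanism can be approximated by the following sketch. It is only an analogy: Python's import machinery stands in for the Oberon module loader, and the function name run_command is an assumption of this sketch. A command is written as "Module.Procedure"; the module is loaded on first use and never duplicated.

```python
import importlib
import sys

def run_command(name: str):
    """Activate a command written as 'Module.Procedure'."""
    module_name, proc_name = name.rsplit(".", 1)
    # import_module loads the module on first use; later calls return the
    # instance already held in sys.modules, so no duplicate is created.
    module = importlib.import_module(module_name)
    proc = getattr(module, proc_name)
    return proc()           # commands are parameterless

cwd = run_command("os.getcwd")
first = importlib.import_module("os")
second = importlib.import_module("os")
```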

The first Oberon compiler was available by 1986 for the Ceres computer, a 
workstation built around the NS 32000 processor. Subsequently, several imple- 
mentations were completed at ETH Zurich, among others for the PC, the Apple 
Macintosh, and the Sun workstation. Emphasis was given to avoid differences 
between their source languages. Indeed portability was achieved almost to per- 
fection. The basic operating system, including the file and viewer (tiled windows) 
system, text editor and compiler required less than 200 Kbytes of store, a small 
fraction of the space occupied by commercial systems of comparable capability. 
This was considered as a powerful, existential proof that modern design and 
structuring techniques lead to a significant economic advantage. Dramatically 
reduced program size led to dramatically improved reliability [14]. 



5 Concluding Remarks 

The last four decades of research have produced the methods of structured pro- 
gramming, supported by languages such as Algol and Pascal, data typing (Pas- 
cal), information encapsulation and modularization (Modula, Ada), and object 
orientation in the sense of associating specific operations to specific objects in 
heterogeneous data structures (Oberon, Java). Hierarchical structuring is reco- 
gnized as the key to reasoning about program properties and the prerequisite 
for applying the techniques of assertions and loop invariants. In spite of these 
remarkable achievements in the discipline of programming, we are often confronted 
with software products that give rise to complaints: Lack of reliability, frequent 
breakdowns, horrors of complexity, incomplete and inconsistent descriptions, and 
similar blessings. What has happened? 

Today’s software industry is still largely working with methods and languages 
that were acceptable 30 years ago, but are blatantly outdated and inadequate for 
designing complex software today. In many places, this state of affairs is recogni- 
zed; but a remedy is not in sight. There exists a vast amount of legacy code, and 
customers demand and depend on new software being compatible with the past, 
including its mistakes. This request makes new designs impossible, because the 
specifications of the old designs lie in their code, the inscrutable, intertwined, 
unstructured code fraught with tricky constructions and hastily made fixups. In 
particular, complex programs, such as operating systems, compilers, data base 
and CAD systems are subjected to this plight. More isolated systems, such as 
embedded control and sensing applications, are somewhat less afflicted. Desi- 
gners had been more anxious to avoid complexity that escapes their mastery, 
and customers are more likely to give security and reliability a higher priority 
than convenience and generality. 

It is indeed difficult to see a way to break the vicious circle where industry 
delivers what customers want, and customers buy what industry offers. One 
possibility might be to introduce the notion of liability for malfunction. It rests 
on the assumption that customers are willing to pay for higher quality. As the 
price increase would be significant, there is little hope that liability concerns will 
enter the software market at large. However, in a modern world where everything 
has its price tag, also quality will not come for free. 

It is equally difficult to envisage the breaking of another vicious circle, namely 
the one in which industry practices the methods taught at schools, and schools 
teach what is currently practiced in industry. Radical shifts of paradigms ra- 
rely occur. More likely is progress in small steps, preferably steps that are not 
noticeable individually. Industry is gradually becoming aware that old methods 
simply will not do much longer, and hence welcomes small steps. Examples are 
the use of simple annotations and assertions in programs, even if they are not 
checked automatically, or the use of the constructs of a higher level language, 
even if coding must be done with C or C++ and no compiler checks the rules, or 
the gradual enforcement of the rule of module ownership, although no compiler 
checks against violations. 

This brings us to the role of universities. It is here that small cuts into vicious 
circles must be prepared, where modern methods, discipline, and languages ex- 
pressing them appropriately must be taught, where students not only learn to 
program, but to design, to structure their programs. This is infinitely more 
important than teaching students the rules of C and C++, although students 
demand this, because it secures them a good income after graduation. Universities 
must return to take their responsibilities seriously, which demand that universi- 
ties be leaders instead of followers. Unfortunately, current trends do not favour 
such a return. Universities too must look for income, and it is much more 
readily available for research, useful or esoteric, than for “teaching the right stuff”. 
This is deplorable in a field where the methods available are far ahead of the 
methods practiced, and where therefore the promulgation of the existing new 
ideas is much more important than the creation of even more new ideas. 
I hope I have clearly expressed my opinion that programming, programming 
style, programming discipline, and therewith programming languages are still 
not merely one of many issues in computing science, but pillars. Our society 
depends to an ever increasing degree on computing techniques. Let those who are 
tomorrow’s designers receive an excellent education, and let them enjoy excellent 
tools instead of crutches. The mentioned vicious circles will be broken, when 
universities are consciously striving to be leaders rather than followers, as was 
the case some 40 years ago. 
Abstract. Most object-oriented languages offer a limited number of invocation 
semantics. At best, they define a default mode of synchronous invocation, plus 
some keywords to express additional semantic attributes, e.g. synchronisation. The 
very few approaches that offer rich libraries of invocation abstractions usually 
introduce significant overhead and do not support the composition of those 
abstractions. 

This paper describes a pragmatic approach for abstracting invocation semantics 
with emphasis on remote invocations. Invocation semantics, such as synchronous, 
asynchronous, remote, transactional or replicated, are all considered first class 
citizens. Using an elegant combination of the Strategy and Decorator design 
patterns, we suggest an effective way to compose various invocation semantics. 

We completely separate the class definition from the invocation semantics of its 
methods and we go a step further towards full polymorphism: the invocation of the 
same method can have different semantics on two objects of the same class. The 
very same invocation on a given object may even vary according to the client 
performing the invocation. To reduce the overhead induced by the flexibility 
underlying our approach, we rely on just-in-time stub generation techniques. 

Keywords: Distributed objects, programming languages, middleware, design 
patterns, abstractions, remote invocations 

Technical areas: Adaptive Communication Systems, Distributed Systems 
Architecture 

1 Introduction 

Object-oriented languages usually offer rich libraries for sequential programming, 
together with various abstractions for expressing and controlling concurrency. Most 
modern object-oriented languages offer, for example, several forms of collection 
interfaces, e.g., sets, bags, arrays, etc. They usually also support processes and 
semaphores or some alternative built-in concurrency constructs. Composing various 
abstractions to build effective frameworks and applications has been an active area of 
investigation and has given birth to some interesting compilations of good practices [10]. 

In the last decade, many object-oriented languages have been extended towards 
distribution, and new object-oriented languages have been designed from scratch with 
distribution in mind [1, 2, 3, 5, 8, 17 19]. The objective is to provide the developers of 
distributed applications with higher level abstractions than what operating systems usually 
offer. For instance the CORBA approach to distributed programming goes along these 
lines [15]. It replaces the low-level socket-based communication model with a higher 
level construct: remote method invocation [16]. The latter can be viewed as a reproduction 
of the RPC abstraction in an object-oriented context. It basically aims at providing the 
programmer with the illusion of a centralized (i.e., non-distributed) context. For several 
applications, having distributed entities interact through (synchronous) remote method 
invocations is very acceptable. For many others, it is not. The reason is mainly twofold. 
First, remote invocations mean encoding (marshalling) and decoding (unmarshalling), 
behind the scenes, the name of an operation and its arguments into some standard format 
that could be shipped over the wire. Added to the latency of remote communication, the 
marshalling and unmarshalling time is a strict overhead with respect to the time taken to 
perform a local invocation. For many time-critical applications, pretending that the 
overhead is negligible is simply impossible. Some form of asynchronous communication 
model is more adequate. Second, and in particular for large scale applications, the crash of 
a remote host and communication failures can simply not be ignored. More complex 
replication and transactional schemes are needed. 
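
The marshalling and unmarshalling steps mentioned above can be sketched as follows. The use of JSON as the wire format, and the names Counter, marshal, and unmarshal_and_invoke, are assumptions made for illustration only; real middleware such as CORBA uses its own encoding. No network is involved here; the byte string merely represents what would be shipped over the wire.

```python
import json

class Counter:
    """Hypothetical server-side object."""
    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n
        return self.value

def marshal(operation, *args):
    """Encode an operation name and its arguments into a byte string."""
    return json.dumps({"op": operation, "args": list(args)}).encode("utf-8")

def unmarshal_and_invoke(target, data):
    """Decode a request and perform the invocation on the target object."""
    request = json.loads(data.decode("utf-8"))
    return getattr(target, request["op"])(*request["args"])

server_object = Counter()
wire_bytes = marshal("add", 5)          # what would travel over the network
reply = unmarshal_and_invoke(server_object, wire_bytes)
```

Even in this toy form, every call pays for one encoding and one decoding step that a local invocation does not need.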

In short, remote method invocation has its own most suitable domain of applications 
but can hardly be considered as the single communication construct to fit all kinds of 
applications. The same reasoning applies to alternative communication paradigms such as 
asynchronous or atomic invocation [14, 16, 19]. In fact, even the parameter passing mode 
can be subject to variations and there is no best scheme (deep vs. shallow copy) that fits 
all applications. 

We explore a fully object-oriented approach where invocation semantics are 
themselves viewed as first class abstractions [12]. Just like there are extensible libraries of 
data manipulation abstractions, we suggest an approach where various forms of 
invocation semantics are classified within an (interface) class library. The programmer can
use various classes of that library to express different models of communication within the
same application. Following the Strategy design pattern [9, 10], we decouple algorithms
from protocols, and hence promote the reuse of the code underlying the invocations. We
also use the Decorator design pattern [10] to help compose several semantics for the
same invocation (e.g., replicated plus asynchronous plus passing by reference). Specific
semantics are viewed as filters that decorate the invocations.

Our approach shares some similarities with so-called reflective distributed object- 
oriented languages (e.g., [1, 19]). The underlying goal is the same: promoting system 
level abstractions (e.g., remote communication) to the rank of first class citizens. 
However, reflective approaches mainly focus on building Meta-Object Protocols (MOPs) 
that transparently intercept object interaction. A MOP enables the programmer to plug 
together specialized invocation semantics. We ignore transparency issues and address the 
question of how to represent invocation abstractions in such a way that they can be 
composed in a flexible and effective manner. Our approach is pragmatic in the sense that the
programmer is fully aware of the communication semantics used in a given
program: choosing the right communication model is not the task of a specialized
meta-programmer. This enables us to use a very effective technique to reduce the overhead of
our flexible approach: just-in-time stub generation [11]. We describe some performance
measurements obtained from our Oberon implementation and show that
the cost of our flexible yet pragmatic approach is negligible.

The rest of the paper is organized as follows. Section 2 gives an overview of how to
program with first-class invocations in our approach. Section 3 describes the way the
invocation library is organized and the rationale behind our combination of the Strategy
and Decorator design patterns. Section 4 discusses implementation issues and Section 5 
the performance of our just-in-time stub generation technique. Finally, Section 6 
summarizes the contribution of the paper and gives some concluding remarks about the 
general applicability of the approach. 

2 Programming with Composable Message Semantics 

This section gives an overview of our library with the help of the “Dining Philosophers” 
problem [6]. This problem is well suited to show the advantages of our framework. 

The message semantics of common object-oriented environments are fixed. The 
system either enforces one fixed semantics, or allows the choice between a small fixed set 
of semantics, each associated with some pre-defined keyword. Our invocation 
abstractions offer an open way to create arbitrarily many new kinds of semantics. For 
every method one can define the semantics that handles the invocation of that method: 
this is done by creating an instance of the invocation class and assigning it to the desired 
method. While doing so, two sets of semantics must be supplied: caller and callee side 
invocation semantics (see Figure 1). 



Figure 1: Overview of intercepted invocation

The chosen client-side semantics is executed on the host of the stub object, while the
server-side semantics is executed on the host of the real object. This distinction has two advantages.
First, the programmer can decide, individually for each part of the invocation semantics, 
where it should be executed, i.e., on the client or on the server. Second, when several 
hosts have a stub of the same server object, a client-based modification is executed only 
when the corresponding stub object is invoked. A server side modification is executed 
whenever a method is invoked on the real server object, i.e., regardless of the stub that 
initiated the invocation. 

Our first approach to the dining philosophers is a straightforward implementation
that ignores all synchronisation concerns and is, of course, not correct.

MODULE Philosophers;

IMPORT Threads;

TYPE
  Eater* = POINTER TO EaterDesc;
  EaterDesc* = RECORD (Threads.ThreadDesc) END;

PROCEDURE (me: Eater) Think*;  (* me denotes the receiver of the method Think *)
END Think;

PROCEDURE (me: Eater) Eat*;
END Eat;

PROCEDURE Start;
VAR t: Threads.Thread; self: Eater;
BEGIN
  t := Threads.ActiveThread(); self := t(Eater);
  LOOP self.Think; self.Eat END
END Start;

PROCEDURE Dinner*;
VAR i: INTEGER; p: Eater;
BEGIN
  FOR i := 0 TO 4 DO NEW(p); Threads.Start(p, Start, 10000) END
END Dinner;

END Philosophers.

To correct this faulty behaviour we have to insert synchronisation code. The straightforward
solution is to insert synchronisation mechanisms, e.g. a semaphore for every fork,
before and after the invocation of Eat. However, this intermixes application code with
the code that enforces the synchronisation constraints.



Figure 2: Semantics for Eat using DecObjects

Using our composable message semantics (CMS) we circumvent this mix-up. We use the 
module DecObjects to modify the invocation semantics of the method Eat (see Figure 2). 

PROCEDURE Dinner*;
VAR
  i, res: INTEGER; t: Eater; first, second: Locks.Lock;
  c: Invocations.Class; m: Invocations.Method;
  forks: ARRAY 5 OF Locks.Semaphore; left, right: INTEGER;
BEGIN
  FOR i := 0 TO 4 DO
    NEW(forks[i]); forks[i].Init(1)
  END;
  FOR i := 0 TO 4 DO
    NEW(t);
    c := Invocations.GetClass(t);
    m := c.GetMethod("Eat");
    IF i # 4 THEN left := i; right := i+1 ELSE left := 0; right := 4 END;
    first := Locks.New(forks[left]); second := Locks.New(forks[right]);
    first.next := second; second.next := DecObjects.Invocation();
    m.SetCallerInvocation(first);
    DecObjects.SetSemantics(t, c, phils[i], res);
  END;
  FOR i := 0 TO 4 DO
    Threads.Start(phils[i], Start, 10000)
  END
END Dinner;

As we changed only the initialising procedure Dinner, we show only this procedure.
First we initialise the necessary forks (the array forks). Afterwards we initialise our philosophers
in a loop. The procedure GetClass returns a Class object for the passed object instance.
This Class object contains all information about all methods (including inherited ones). In
particular, it contains the information needed to change the invocation semantics. One
can scan this information and change it as required. We assign the
new caller-side invocation semantics, held in first, to the method Eat. Afterwards, we assign the modified semantic
information to the corresponding philosopher phils[i] by calling SetSemantics.

The variable first defines the semantics to be used for the method Eat of the different
philosophers. The semantics consists of two locking filters and the invocation abstraction
DirectInvocation (see Figure 2), which is part of the CMS framework and actually
invokes the method. A locking filter first acquires its assigned semaphore (a fork in this
example) and then passes the invocation on. The above example clearly shows the
separation of application and synchronisation code. The code necessary for the
synchronisation is concentrated within the module body. The actual application code stays
as it was before we introduced synchronisation into the example.
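
The chain of two locking filters in front of the invoking abstraction can be sketched as follows. This is only an illustration, written in Python rather than Oberon so that it is self-contained; the classes LockFilter, DirectInvocation and Eater and the function make_eat_semantics are hypothetical stand-ins, not the API of the CMS framework. The sketch also assumes that a locking filter releases its semaphore once the forwarded invocation returns, which the text above does not state explicitly.

```python
import threading


class DirectInvocation:
    """Terminal abstraction: actually calls the method (hypothetical name)."""

    def invoke(self, obj, method_name, args):
        return getattr(obj, method_name)(*args)


class LockFilter:
    """Filter that holds a semaphore while the invocation is forwarded."""

    def __init__(self, semaphore, next_invocation):
        self.semaphore = semaphore
        self.next = next_invocation

    def invoke(self, obj, method_name, args):
        # Assumption: the fork is released again after the forwarded call.
        with self.semaphore:
            return self.next.invoke(obj, method_name, args)


class Eater:
    """Application code: no synchronisation appears here."""

    def __init__(self):
        self.meals = 0

    def eat(self):
        self.meals += 1
        return self.meals


def make_eat_semantics(forks, i):
    """Chain two lock filters before the direct invocation, as in Dinner."""
    left, right = (i, i + 1) if i != 4 else (0, 4)
    return LockFilter(forks[left], LockFilter(forks[right], DirectInvocation()))


def dinner(rounds=100):
    forks = [threading.Semaphore(1) for _ in range(5)]
    eaters = [Eater() for _ in range(5)]
    chains = [make_eat_semantics(forks, i) for i in range(5)]

    def run(i):
        for _ in range(rounds):
            chains[i].invoke(eaters[i], "eat", ())

    threads = [threading.Thread(target=run, args=(i,)) for i in range(5)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [e.meals for e in eaters]
```

Because every philosopher acquires the lower-numbered fork first (the last one takes fork 0 before fork 4), the forks are always requested in the same global order, so the sketch cannot deadlock.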




Figure 3: Semantics for Eat using DObjects 

Using the distributed objects package we can increase the flexibility even further: we
allow the philosophers to reside on different hosts, i.e. they access the table only when
calling Eat, in order to ensure correctness. We decorate Eat with an appropriate semantics
(see Figure 3). On the client we choose SyncInvocation, which executes the invocation as a
synchronous remote invocation. On the server we choose the same semantics that we
used for our previous version. We do not define a semantics for Think; however,
we are free to choose whether it should execute on the server or on the client. We marked
the code that we had to modify in order to distribute our dining philosophers
implementation. We split the program into a server and a client part. The server Serverinit
prepares the philosopher objects, assigns the correct semantics and exports them on the
network. The client Client imports the remote object and invokes its method without
caring about the semantics used.

PROCEDURE Serverinit*;
VAR
  i, res: INTEGER; l, r, first: Lock.Lock; c: Invocations.Class; m: Invocations.Method;
  forks: ARRAY 5 OF Lock.Semaphore;
  left, right: INTEGER;
  si: Invocations.Invocation;
  name: ARRAY 6 OF CHAR;










BEGIN
  FOR i := 0 TO 4 DO NEW(forks[i]); forks[i].Init(1) END;
  si := DObjects.SyncInvocation();
  FOR i := 0 TO 4 DO
    NEW(phils[i]);
    c := Invocations.GetClass(phils[i]);
    m := c.GetMethod("Eat");
    IF i # 4 THEN left := i; right := i+1 ELSE left := 0; right := 4 END;
    l := Lock.New(forks[left]); r := Lock.New(forks[right]);
    l.next := r; r.next := DecObjects.Invocation();
    m.SetCalleeInvocation(l);
    m.SetCallerInvocation(si);
    name := "PhilX"; name[4] := CHR(i + ORD("0"));
    DObjects.Export(phils[i], Network.DefaultHost(), name, c, res)
  END
END Serverinit;

PROCEDURE Client*;
VAR
  name: ARRAY 6 OF CHAR; i, res: INTEGER; p: Philosophers.Eater;
BEGIN
  In.Open; In.Int(i); name := "PhilX"; name[4] := CHR(i + ORD("0"));
  DObjects.Import(Network.ThisHost("..."), name, p, res);
  Threads.Start(p, Start, 10000)
END Client;



3 Composing Invocation Abstractions 

This section discusses the extensibility of our approach. We first describe our 
classification model and then how the invocation models can be composed and extended. 

3.1 Combining the Decorator and Strategy Design Patterns 

We use a flat class hierarchy, which allows us to change the behaviour combinations 
dynamically. To do so, we combined two design patterns: the Decorator design pattern 
and the Strategy design pattern [10]. The Decorator pattern is basically used for the static 
composition and the Strategy pattern for the dynamic one (see Figure 4). 




Figure 4: Type hierarchy of the CMS framework 

An invocation abstraction can be decorated with arbitrarily many additional decorators. 
We distinguish between extensions of Invocation and extensions of InvocationFilter.
Extensions of InvocationFilter are just decorators. We call them filters. They may be
cascaded in an arbitrary order (although some implementations may require a particular
order). They extend the semantics of the invocation, i.e., add functionality, but do not
implement the invocation themselves. They forward the invocation to their decorated
object. Extensions of Invocation, on the other hand, actually invoke the chosen method.
These are the actual abstractions. A composition of a set of filters and one
abstraction that is in actual use is called a semantics.

This structure is flexible in two directions. First, one can add arbitrary new filters, 
e.g., for logging, visualization, synchronization, authentication, etc. Second, one can add 
new invocation abstractions, e.g., best-effort, at-most-once, delayed, etc. The structuring 
also promotes arbitrary combinations of the different invocation abstractions and filters. 

We use the Strategy design pattern in our system at run-time. All currently used 
semantics are held in an array where they are stored at a known index. The stubs only see 
an array of strategies on how a method may be invoked (Section 4 gives more details on 
this aspect). 

3.2 Composition 

The current library of invocation abstractions and filters is relatively small. However, the
principles governing composition stay the same when new abstractions or filters are
implemented and added to the framework. Whenever one modifies the model with which
a method is invoked, two semantics need to be supplied: one for the client and one for the
server. Each executes on its designated host.



Figure 5: Chain of invocation filters (Filter 1, Filter 2, ..., Filter n, each linked to its successor through next, ending in the Abstraction)

A semantics consists of exactly one invocation abstraction and of arbitrarily many
invocation filters (see Figure 5). A filter never handles an invocation directly but, after
some filter-specific work, forwards it to its decorated object. Only the invocation
abstraction actually executes the invocation. Server-side abstractions start the
execution of the method on the real object, i.e., they are located where the actual work occurs.
Client-side abstractions are responsible for transporting the invocation to the real object,
in order to trigger the execution of the server-side invocation semantics. Typically, this
involves some kind of network traffic.
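
The composition rule (any number of filters, exactly one abstraction, one semantics per side) can be sketched as follows. The sketch is in Python rather than Oberon, and every name in it (compose, TraceFilter, LoopbackInvocation, Counter) is a hypothetical illustration, not part of the framework. In particular, the client-side abstraction here simply calls the server-side semantics in the same process, where a real one would transport the invocation over the network.

```python
class Invocation:
    """Base type: anything that can handle an invocation."""

    def invoke(self, obj, method, args):
        raise NotImplementedError


class InvocationFilter(Invocation):
    """Decorator: does filter-specific work, then forwards to `next`."""

    def __init__(self, next_invocation=None):
        self.next = next_invocation

    def invoke(self, obj, method, args):
        return self.next.invoke(obj, method, args)


class TraceFilter(InvocationFilter):
    """Example filter that records its label before forwarding."""

    def __init__(self, label, trace, next_invocation=None):
        super().__init__(next_invocation)
        self.label, self.trace = label, trace

    def invoke(self, obj, method, args):
        self.trace.append(self.label)
        return super().invoke(obj, method, args)


class DirectInvocation(Invocation):
    """Server-side abstraction: runs the method on the real object."""

    def invoke(self, obj, method, args):
        return getattr(obj, method)(*args)


class LoopbackInvocation(Invocation):
    """Client-side abstraction: hands the call to the server-side semantics.
    Assumption: a plain call replaces the network transport."""

    def __init__(self, server_semantics):
        self.server_semantics = server_semantics

    def invoke(self, obj, method, args):
        return self.server_semantics.invoke(obj, method, args)


def compose(filters, abstraction):
    """Link the filters in order and end the chain with one abstraction."""
    head = abstraction
    for f in reversed(filters):
        f.next = head
        head = f
    return head


class Counter:
    def __init__(self):
        self.value = 0

    def inc(self, n):
        self.value += n
        return self.value


trace = []
server = compose([TraceFilter("server-auth", trace)], DirectInvocation())
client = compose(
    [TraceFilter("client-log", trace), TraceFilter("client-sync", trace)],
    LoopbackInvocation(server),
)
result = client.invoke(Counter(), "inc", (5,))
```

The trace shows the order in which the two semantics run: all client-side filters first, then the client-side abstraction, then the server-side chain.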

3.3 Extension 

To extend the framework with new kinds of invocations, one has to distinguish between 
abstractions and filters. Writing a filter is much simpler. It requires implementing a new 
type that extends InvocationFilterDesc, and overriding the method Invoke. Consider the 
following example: 

MyFilter = POINTER TO MyFilterDesc;
MyFilterDesc = RECORD (Invocations.InvocationFilterDesc)
END;

PROCEDURE (inv: MyFilter) Invoke (obj: PTR; id: LONGINT; s: Stream): Stream;
VAR result: Stream;
BEGIN
  SomePreprocessing(obj, id, s);
  result := inv.next.Invoke(obj, id, s);
  SomePostprocessing(obj, id, s, result);
  RETURN result
END Invoke;

Invoke receives the receiver object obj, an identifier id that denotes the called procedure,
and a stream s that contains the marshalled parameters. (The exact mode used for
marshalling may also be changed along with the invocation semantics; for further
information see [12].) As a return value, it supplies the stream containing the marshalled
result of the invocation. Before and after the invocation is forwarded to the next
abstraction, the filter can do its specific work. With the help of meta-programming
facilities, it can even scan the parameter stream and react to its contents.
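
A filter that inspects the marshalled streams before and after forwarding might look like the sketch below. It is written in Python for illustration only; AuditFilter and EchoAbstraction are hypothetical, and the wire format (a single 32-bit integer in network byte order) is an assumption made for the example, not the marshalling format of the framework.

```python
import struct


class EchoAbstraction:
    """Stand-in for the next element of the chain (hypothetical): it reads
    one 32-bit integer from the stream and returns that integer minus one."""

    def invoke(self, obj, method_id, stream):
        (n,) = struct.unpack("!i", stream)
        return struct.pack("!i", n - 1)


class AuditFilter:
    """Filter with pre- and postprocessing around the forwarded invocation."""

    def __init__(self, next_invocation, log):
        self.next = next_invocation
        self.log = log

    def invoke(self, obj, method_id, stream):
        # Preprocessing: peek at the marshalled parameter without consuming it.
        (n,) = struct.unpack("!i", stream)
        self.log.append(("call", method_id, n))
        result = self.next.invoke(obj, method_id, stream)
        # Postprocessing: inspect the marshalled result before returning it.
        (r,) = struct.unpack("!i", result)
        self.log.append(("return", method_id, r))
        return result


log = []
chain = AuditFilter(EchoAbstraction(), log)
reply = chain.invoke(None, 2, struct.pack("!i", 10))
```

The filter never changes the streams it forwards, so it can be inserted at any position of a chain without affecting the other elements.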

4 Implementation Issues 

Intercepting ordinary method invocations with minimal overhead is a key issue underlying 
our approach. In our case, this is done by cloning the actual object on which the methods 
are invoked. Methods invoked on the actual object are still handled as in any traditional 
object-oriented language. Those invoked on the clone are intercepted and handled as 
specified by the class information. The interception is done with the help of automatically 
generated code pieces: stub and skeleton (see Figure 1). In this section we describe the 
generation of the stub and the skeleton with their adaptation to the chosen semantics. 

The code generation mechanism is generic. However, one normally does not access it
directly but only indirectly through another API (e.g., the distributed object system). Methods
invoked on the clone (stub object) are handled by the generated stub code. The stub has
the same interface as the actual method and replaces it transparently. It marshals the
parameters and activates the appropriate client invocation semantics. The
skeleton code is generated at the same time. The skeleton is the counterpart of the stub on the server. It
unmarshals the parameters and calls the actual method.

In typical applications, one does not have to care about these low-level abstractions
but works with a higher-level one, e.g. the distributed objects package. In this case, the
Import and Export procedures automatically generate the necessary code. The application
programmer sees nothing of these details.

From the beginning, one of our main goals was to keep the delay introduced by the
increased flexibility of arbitrary invocation semantics as small as possible. In
particular, this means that the delay introduced is constant with respect to the number of
managed objects and the number of different semantics. Achieving that goal requires the
stub code to know exactly where its semantics is. This, and the adaptable parameter
passing modes, force the code generation to be done after the semantics have been
defined. It also requires that the stub and skeleton code be generated at run-time: hence our
just-in-time stub generation approach.

All invocation semantics in use are stored in an array and are identified by their index.
Each newly defined semantics is
assigned a slot within this array, i.e., it receives a unique number. The stub can use this
knowledge to access the correct semantics directly through an array access. As this
index is a constant at stub compilation time, the compiler can even
calculate the offset during compilation. This reduces the actual overhead on the client side
to a negligible size. Consider the following class Object, which defines a single method
Dec.

Object = POINTER TO ObjectDesc;
ObjectDesc = RECORD
  PROCEDURE (o: Object) Dec (n: INTEGER): INTEGER;
END;

The generated code below consists of two procedures: one for the stub object and one for
the skeleton. The generated procedure Dec of the stub object is used to intercept
invocations. It mostly deals with marshalling and unmarshalling of parameters. The actual
call of the assigned semantics (the call to Invocations.inv[10].Invoke in the following listing) is a simple method
invocation. In this example, the client-side semantics is stored at index 10. The index is
constant and therefore the compiler can calculate the offset into the array, which reduces
the delay even further. The corresponding skeleton DecSkeleton has the inverse task. It
unmarshals the parameters, invokes the actual method, and marshals the output values.

PROCEDURE (o: Object) Dec (n: INTEGER): INTEGER;
VAR lin: Linearizer; s: Stream; retVal: INTEGER;
BEGIN
  lin := NewWriter();
  lin.Integer(n);
  s := lin.Stream();
  s := Invocations.inv[10].Invoke(o, 2, s);
  lin := NewReader(s);
  lin.Integer(retVal);
  RETURN retVal
END Dec;

PROCEDURE DecSkeleton (receiver: Object; VAR stream: Stream);
VAR lin: Linearizer; retVal, n: INTEGER;
BEGIN
  lin := NewReader(stream);
  lin.Integer(n);
  retVal := receiver.Dec(n);
  lin := NewWriter();
  lin.Integer(retVal);
  stream := lin.Stream()
END DecSkeleton;

The code shown above is never actually generated as source text. Our implementation directly
generates compiler intermediate code, i.e., it builds an abstract syntax tree. This approach
is portable, as we use a portable intermediate language with suitable back-ends for
different platforms. This mechanism allows extremely fast code generation (as shown
by our measurements below). The code generator needs the meta information of the
intercepted class to generate the necessary code pieces. Our approach hence does not introduce any
loss in speed with respect to an ad hoc approach where all invocations have the same
semantics. Even the amount of data transferred from client to server is not
increased by introducing flexible invocation semantics. The only overhead is the
one introduced by the distributed lookup technique, as we discuss below.
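
The interplay of the semantics array, the stub and the skeleton can be sketched as follows. The sketch is in Python and rests on several assumptions: a closure stands in for the run-time code generation (the actual implementation builds an abstract syntax tree), LocalInvocation replaces a real client-side semantics by a direct call to the skeleton, and the names register, make_dec_stub and Obj are hypothetical.

```python
import struct

inv = []  # array of all semantics in use, addressed by index


class Obj:
    def __init__(self, start):
        self.value = start

    def dec(self, n):
        self.value -= n
        return self.value


def dec_skeleton(receiver, stream):
    """Skeleton: unmarshal the parameter, call the method, marshal the result."""
    (n,) = struct.unpack("!i", stream)
    return struct.pack("!i", receiver.dec(n))


class LocalInvocation:
    """Stand-in semantics (assumption): dispatches to a table of skeletons."""

    def __init__(self, skeletons):
        self.skeletons = skeletons

    def invoke(self, obj, method_id, stream):
        return self.skeletons[method_id](obj, stream)


def register(semantics):
    """Assign a newly defined semantics its slot and return the slot number."""
    inv.append(semantics)
    return len(inv) - 1


def make_dec_stub(index, method_id):
    """Build the stub after the semantics is defined, so `index` is fixed."""

    def dec(target, n):
        stream = struct.pack("!i", n)  # marshal the parameter
        stream = inv[index].invoke(target, method_id, stream)
        (ret,) = struct.unpack("!i", stream)  # unmarshal the result
        return ret

    return dec


slot = register(LocalInvocation({2: dec_skeleton}))
stub_dec = make_dec_stub(slot, 2)
real = Obj(10)
answer = stub_dec(real, 3)
```

Whatever the number of registered semantics, the stub reaches its own semantics with a single indexed access, which is the property the text relies on for constant overhead.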

5 Performance Measurements 

In this section we describe some performance figures we obtained from our Oberon 
prototype. Although it has not been optimised, our prototype enables us to draw 
interesting conclusions about the cost of our flexible approach. Basically, we compare that
cost with the cost of an ad hoc approach derived from our prototype, which always uses
synchronous remote invocation.

5.1 Overview 

As our test environment, we used a Pentium 200 MHz computer running Windows NT
Version 4.0 (Build 1381: Service Pack 3). To measure the time, we used a special register
of the Intel architecture that always contains the number of cycles since the latest reset. This allows
for extremely accurate timing. To cope with differences introduced by the garbage
collector, we ran all measurements three times in sequence, with 100, 1000, and 10000
invocations. We also repeated all measurements ten times, omitting the fastest and the
slowest measurements. To make our measurements independent of the installed network
and its current load, we used local TCP loop-back. In other words, client and server were
running on the same machine and communicated via the TCP loop-back host (127.0.0.1).

Our marshalling mechanism is neither time nor memory optimised, i.e., it allocates a 
considerable amount of memory. This results in several runs of the garbage collector 
during the measurement. If not noted otherwise, the time spent collecting garbage is 
included in the measured intervals. 

Our measurements use different situations to compare our flexible approach with an 
ad hoc approach. This ad hoc approach is a modified version of our flexible approach. It is 
modified in two ways. First, all parts allowing for arbitrary semantics have been removed, 
i.e., the ad hoc version always uses synchronous remote invocation. Second, we optimized 
the ad hoc version wherever a simpler solution was possible by the introduction of a fixed 
invocation semantics. The main differences between the flexible and the ad hoc approach
can be seen in the following source snippets.



fixed semantics server:
    s := Objects.Call(obj, mID, s);

fixed semantics client stub:
    s := DObjects.SendCall(obj, mID, s);

arbitrary semantics server:
    WHILE m.id # mID DO
        m := m.next
    END;
    inv := m.GetServerInvocation();
    s := inv.Invoke(obj, id, s)

arbitrary semantics client stub:
    s := Invocations.inv[10].Invoke(obj, mID, s);





As one can see, on both server and client side, additional work is done. The most 
expensive operation is the server-side lookup for the correct server-side semantics. This 
lookup is included in our measurements. 
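
The difference between the two lookups can be made concrete with a small sketch. It is written in Python; the node layout MethodInfo and the function find_server_invocation are hypothetical, and the guard against a missing method identifier is an addition that the snippet above does not contain.

```python
class MethodInfo:
    """Node of a per-class method list (hypothetical layout)."""

    def __init__(self, method_id, server_invocation, next_info=None):
        self.id = method_id
        self.server_invocation = server_invocation
        self.next = next_info


def find_server_invocation(first, method_id):
    """Server side: walk the list until the method identifier matches.
    Returns the semantics and the number of links followed."""
    m, steps = first, 0
    while m is not None and m.id != method_id:
        m, steps = m.next, steps + 1
    if m is None:
        raise KeyError(method_id)
    return m.server_invocation, steps


# Build a list with method identifiers 0..4, head first.
methods = None
for mid in reversed(range(5)):
    methods = MethodInfo(mid, f"semantics-{mid}", methods)

# Client side: the semantics sits at a constant index, no search is needed.
inv = [f"client-semantics-{i}" for i in range(11)]
client_side = inv[10]

server_side, steps = find_server_invocation(methods, 3)
```

The server-side cost grows with the position of the method in the list, whereas the client-side access is independent of the number of semantics.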

TYPE 

Object* = POINTER TO ObjectDesc; 

ObjectDesc* = RECORD 
ctr: LONGINT 

END; 

PROCEDURE (o: Object) Next* ; 

END Next; 

We used the above class definition as our test environment. This definition declares a 
class Object with one method Next that has an empty body. Depending on the test we 
added parameters and/or a return value to the interface of Next. The server simply 
allocates an instance o of this class and exports it on the default host under the name 
“TestObject”. 

NEW(o); o.ctr := 0; 

DObjects.Export (o, Network.DefaultHost(), "TestObject", NIL, err); 

In the standard test case we supply no invocation meta information (NIL), so the default
semantics, synchronous remote invocation, is used. Depending on the test, we supply
customized meta information here.

DObjects.Import(Network.ThisHost("TestServer"), "TestObject", o, err);
o.Next;

(* start measurement *)
FOR i := 1 TO n DO
  o.Next
END;
(* stop measurement *)

The client mainly consists of a loop that repeatedly calls the method Next. The loop
repeats n times, depending on the current test. Our implementation needs some additional
time to build data structures whenever an object receives its first invocation. Therefore we
call Next once before we start the actual measurement (the code generation needs less than
2 milliseconds and can be further reduced by maintaining a persistent pool of already
generated code snippets).
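
The measurement procedure described in this section (one warm-up call, batches of 100, 1000 and 10000 invocations, ten repetitions with the fastest and slowest dropped) can be sketched as follows. The sketch is in Python; the function names are hypothetical, time.perf_counter_ns stands in for the processor cycle counter, and averaging the remaining samples is an assumption, since the text does not say how they were combined.

```python
import time


def time_batch(call, n):
    """Time n consecutive invocations; returns nanoseconds per invocation."""
    start = time.perf_counter_ns()
    for _ in range(n):
        call()
    return (time.perf_counter_ns() - start) / n


def trimmed_mean(samples):
    """Drop the single fastest and slowest sample and average the rest."""
    if 3 > len(samples):
        raise ValueError("need at least three samples to trim both extremes")
    kept = sorted(samples)[1:-1]
    return sum(kept) / len(kept)


def measure(call, sizes=(100, 1000, 10000), repeats=10):
    """Return the trimmed mean time per invocation for each batch size."""
    call()  # warm-up: the first invocation pays one-time setup costs
    return {
        n: trimmed_mean([time_batch(call, n) for _ in range(repeats)])
        for n in sizes
    }
```

Calling measure with a function that performs one remote invocation yields one figure per batch size, comparable in structure to the series plotted in the figures of this section.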

5.2 Measurements 

On our test configuration, it takes about 990 microseconds to echo a TCP packet 
containing one byte of user data from user space to user space. This includes the overhead 
introduced by our generic network interface as well as the additional TCP layer 
introduced by the Oberon environment. 

To determine how the invocation duration depends on the amount of passed
parameters, we measured with three different amounts of parameter data. We made the
measurements with the following interfaces:

PROCEDURE (o: Object) Next*;

PROCEDURE (o: Object) Next (i: LONGINT): LONGINT;

PROCEDURE (o: Object) Next (VAR buf: ARRAY 1024 OF CHAR);

In every case, the same amount of data is transferred back and forth over the network (0,
4, and 1024 bytes, respectively). The results are shown below in milliseconds per invocation (see Figure 6).



Figure 6: Measurements depending on parameter size (panels: no parameters, in/out of 4 bytes, in/out of 1024 bytes; x-axis: repetitions)

Next we tried to determine the additional overhead introduced by other invocation
semantics. To test this, we decorated the invocation of the method Next with a
server-side filter, a client-side filter, or both (see Figure 7). Both filters have an empty body. We
took the first version of Next, without parameters. These measurements were, of course,
only done on the version with arbitrary invocation semantics.

Figure 7: Measurements depending on additional filters (chart title: Measurements with different semantics; x-axis: 100, 1000, and 10000 repetitions; series: no, server, client, server and client)

Finally, we split up the invocation time to determine where exactly the time is spent (see
Figure 8). These measurements were only done on the version with arbitrary invocation
semantics. We split the time into five parts: transport, server-side semantics, client-side
semantics, stub, and skeleton, and compare them as percentages of the whole.

Figure 8: Measurements split into smaller units



5.3 Discussion 

The figures in Section 5.2 show the extremely small overhead introduced by our arbitrary
invocation semantics. The parameter size does not influence the speed penalty of using
arbitrary semantics (see Figure 6). An invocation with no parameters needs around two
milliseconds, regardless of whether fixed or arbitrary semantics are used. Introducing
parameters does not change this behaviour; only the overall performance degrades
depending on the actual parameter size. A surprising behaviour can be seen when the
parameter size is big (1024 bytes). The performance is much worse if we repeat the invocation
more often (see Figure 6). This is probably a consequence of the garbage collector, which
has to run more often.

The second measurement was made to show the performance loss caused by introducing
invocation filters (see Figure 7). It shows that it is quite cheap to introduce another
invocation filter. This was to be expected, as an empty filter introduces nothing but an
additional method invocation. The measured differences are actually almost too small to
yield meaningful results.

Finally, we split the invocations into five parts: transport, server-side semantics, client-side
semantics, stub code, and skeleton code. We compared the parts by the
percentage of the whole invocation time spent executing them. The results show that
server and client semantics never use more than half a percent of the time spent for the
invocation (see Figure 8). The main part of the invocation time is spent in transit from the
caller to the callee or vice versa. Optimisation efforts will find much to improve there.
Another potential for further optimisation is the marshalling done in the skeleton and
the stub. But however well one optimises these parts, the mechanism of arbitrary
invocation semantics is never the real bottleneck for remote method invocations.

With all these comparisons, one has to remember that the CMS framework is tuned
for distributed applications. For strictly local applications one has to implement another
scheme which, e.g., does not marshal parameters.

6 Conclusions 

It is tempting to assume that all distributed interactions of a given application can be 
performed using just one (synchronous remote) method invocation abstraction, just like in 
a centralized system. In practice, this uniformity usually turns out to be restricting and 
penalizing and the myth of "distribution transparency" is very misleading. It is now 
relatively well accepted that the "one size fits all" principle does not apply to the context 
of distributed object interactions [18]. Most uniform approaches to object-oriented 
distributed programming have recently considered extensions to their original model in 
order to offer a more flexible choice of interaction modes. For example, the OMG is in the 
process of standardizing a messaging service to complement the original CORBA model 
with various asynchronous modes of interaction [15]. 

Several object-oriented languages offered, from scratch, various modes of 
communication. Each is typically identified by a keyword and corresponds to a well 
defined semantics. For example, the early ABCL language supported several keywords to 
express various forms of asynchrony, e.g., one-way invocation, asynchronous with future,
etc. [19]. Similarly, the KAROS language supported several keywords to attach various
degrees of atomicity to invocations, e.g., nested transaction, independent transaction,
etc. [7]. The major limitation of these approaches is that one can never predict the needs of
the developer, and coming up with a new form of interaction means changing the language.
Other approaches, e.g. composition filters [4], focus on the possibility of defining arbitrary
semantics. However, they use a class-centric approach that is not suited for distribution,
and one pays a large penalty in execution time.

Our approach is different in that we focus on how to structure an extensible library of
interaction abstractions in a pragmatic and effective way. By combining some good
practices in classifying abstractions using the Decorator and Strategy design patterns, we
suggest a composition model that promotes the reuse of invocation implementations (we
go a step further than in [9]). We reduce the overhead inherent in the flexibility of our
approach by giving up transparency and relying on just-in-time stub generation.

We illustrated our approach by building a distributed extension of the Oberon system 
and we demonstrated it on a simple example. The very same approach could be applied to 
other languages and environments. The requirements are easily fulfilled. The basic
requirements are: (1) run-time access to a compiler; (2) dynamic code loading; and (3)
meta information. The only problematic implementation part is the support for run-time
type generation. We solve this problem by patching the corresponding type descriptor. In
order to validate the claim that our composable message semantics can be applied to other
languages and platforms, we implemented another prototype in Java. This prototype offers
similar functionality but uses another implementation technique to offer the desired
flexibility. Whenever an object is either imported or exported, our run-time system
generates the necessary code snippets with the help of the standard Java compiler javac. This
means that we generate Java source code and load the class file that is generated by the
compiler. This approach is quite time consuming (0.2 to 0.4 seconds for stub generation).
A more efficient implementation of our composable message semantics has to use another
technique, e.g. a custom class loader that generates the needed class files by instrumenting
the original files at load-time. Keller and Hölzle [13] describe such a class loader that
allows load-time instrumentation of class files. Using this technique it is possible to
achieve performance comparable to our Oberon implementation.
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Abstract. An enhancement to modular languages called module embedding 
facilitates the development and utilization of secure generic parallel algorithms. 



1 Introduction 

We have designed and implemented a strictly typed modular language framework that 
supports the specification of generic parallel algorithms and the derivation of specific 
parallel applications from such generic algorithms. The focus of our research is on 
message-passing parallelism and cluster computing applications. 

A generic parallel algorithm encapsulates a common control structure, such as a 
master-server network, a pipeline, a cellular automaton, or a divide-and-conquer tree. 
Such a generic algorithm can be used to derive parallel solutions for a variety of 
problems. The key idea is that the generic algorithm must provide a complete 
coordination and synchronization pattern in a problem-independent manner, while its 
clients must provide only problem-specific sequential code in order to derive specific 
parallel applications. 

In our language framework, generic parallel algorithms and their applications are 
specified as modules. A particular application can be derived from a generic algorithm 
by means of module embedding, a code reuse mechanism that enables the building of 
new modules from existing ones through inheritance, overriding of procedures, and 
overriding of types [11]. 

We have incorporated module embedding in the experimental language 
Paradigm/SP. Our language is an enhancement of SuperPascal, a high-level parallel 
programming language developed by Hansen [5]. In addition to embeddable modules, 
the language provides standard message-passing parallel features, such as send, receive, 
for-all, parallel statements, and channel types. We have developed a prototype compiler, 
which generates abstract code, and an interpreter for this abstract code. 
We use Paradigm/SP to specify general parallel paradigms and to derive particular 
parallel applications from such general paradigms. We use the Paradigm/SP compiler 
and interpreter to test such paradigms and their derived applications. Once we have 
established the validity of a Paradigm/SP program, we convert it into efficient C code 
that runs on top of a cluster-computing library, such as PVM. 

We agree with others [3] that "...for a parallel programming language the most 
important security measure is to check that processes access disjoint sets of variables 
only and do not interfere with each other in time-dependent manner". We have adopted 
in Paradigm/SP an interference control scheme that allows secure module embedding in 
the above sense. The Paradigm/SP compiler guarantees that processes in derived parallel 
applications do not interfere by accessing the same variable in a time-dependent manner. 

In this paper, we introduce the concept of an embeddable module and show how a 
generic parallel algorithm can be specified as an embeddable module. We demonstrate 
how module embedding can be employed to derive specific parallel applications from 
generic algorithms. We also explain how module embedding guarantees that processes 
in derived applications do not interfere by reading and updating the same variable. 



2 Specification of Generic Parallel Algorithms as Embeddable 
Modules 

An embeddable module encapsulates types, procedures (and functions), and global 
variables. Module embedding enables building of new modules from existing ones 
through inheritance, and through overriding of inherited types and procedures. An 
embedding module inherits entities that are exported by the embedded module and 
further re-exports them. A principal difference between module embedding and 
module import is that an embedded module is contained in the embedding module and 
is not shared with other modules, while an imported module is shared between its 
clients. Another difference is that a client module cannot override types or procedures 
that belong to an imported module, while an embedding module can override types 
and procedures that belong to an embedded module. 

Type overriding allows a record type that is inherited from an embedded module to 
be redefined by the embedding module by adding new components to existing ones. 
Type overriding does not define a new type but effectively replaces an inherited type 
in the embedded module (i.e., in the inherited code) itself. In contrast, type extension, 
and similarly, sub-classing, define new types without modifying the inherited ones. 
Further details on module embedding and type overriding can be found in [11]. 
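The contrast between type extension and type overriding can be sketched outside Paradigm/SP. The following Python fragment is only an illustration under stated assumptions: the factory `make_ms`, the record name `Problem`, and the fields `a` and `b` are hypothetical, and instantiating a fresh copy of the module stands in for embedding. It is not how the Paradigm/SP compiler implements overriding.

```python
from dataclasses import dataclass, field, fields, make_dataclass
from types import SimpleNamespace


def make_ms(problem_fields=()):
    """Build a private copy of a toy module whose record type Problem
    has exactly the given components (empty by default)."""
    Problem = make_dataclass(
        "Problem",
        [(name, typ, field(default=0.0)) for name, typ in problem_fields],
    )

    def new_problem():
        # "Inherited" code: it always allocates the module's own Problem type.
        return Problem()

    return SimpleNamespace(Problem=Problem, new_problem=new_problem)


# Type extension: a new type is derived, the inherited code is untouched.
ms = make_ms()


@dataclass
class ExtendedProblem(ms.Problem):
    a: float = 0.0
    b: float = 0.0


extended_names = [f.name for f in fields(ExtendedProblem)]
inherited_names = [f.name for f in fields(ms.new_problem())]

# Type overriding: the embedding module owns a copy of the module in which
# Problem itself carries the added components, so inherited code sees them.
ti = make_ms(problem_fields=[("a", float), ("b", float)])
overridden_names = [f.name for f in fields(ti.new_problem())]
```

With extension, the inherited procedure still produces records without the new components; with overriding, the same inherited procedure produces records that include them. The two module copies also do not share their `Problem` type, which mirrors the statement that an embedded module is not shared with other modules.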

We demonstrate the applicability of module embedding to generic parallel 
programming with a case study of a simplified master-server generic parallel algorithm. 
The master-server generic algorithm (Fig. 1) finds a solution for a given problem by 
means of one master and n server processes that interact through two-way 
communication channels. The master generates a version of the original problem that 
is easier to solve and sends it to each server. All servers solve their assigned problems 
in parallel and then send the solutions back to the master. Finally, the master 
summarizes the solutions provided by the servers in order to find a final solution to the 
original problem. 

The generic parameters of the master-server algorithm (Fig. 2) include the type of 
the problem to be solved, the type of its solution, and three sequential procedures: 

- a procedure to generate an instance of the problem that is to be solved by server i; 

- a procedure to solve a particular instance of the problem; 

- a procedure to summarize the set of solutions provided by the servers into a final 
solution. 

The generic master-server algorithm provides its clients with a procedure to 
compute a solution of a specific problem. The compute procedure incorporates the 
master and server processes, but those are not visible to the clients of the generic 
algorithm. 

In Figure 3, all components of the master-server generic algorithm are encapsulated in 
an embeddable module, MS. The export mark, *, designates public entities that are 
visible to clients of module MS. Unmarked entities, such as master and server, are 
referred to as private. The types of the problem and the solutions are defined as empty 
record types (designated by a double dot, ".."). Clients of module MS can (1) extend such 
inherited record types with problem-specific components, (2) provide domain-specific 
versions of procedures generate, solve and summarize, and (3) use procedure compute to 
find particular solutions. 
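As a rough analogue of this structure, the sketch below models the master-server pattern in Python. It rests on several assumptions that are not part of the paper: threads and queues stand in for SuperPascal processes and channels, class inheritance stands in for module embedding, the master runs in the calling thread, and the derived class `SumOfSquares` is a made-up client. The generic parameters keep the names generate, solve and summarize.

```python
import queue
import threading


class Channel:
    """Two-way channel modelled as one queue per direction."""

    def __init__(self):
        self.to_server = queue.Queue()
        self.to_master = queue.Queue()


class MasterServer:
    n = 4  # number of servers (module MS in the paper uses n = 10)

    # Generic parameters, meant to be overridden by derived applications.
    def generate(self, i, p):
        return p        # default: every server receives the original problem

    def solve(self, p0):
        return None     # 'null' version

    def summarize(self, p, b):
        return None     # 'null' version

    # Private coordination code, hidden from clients.
    def _server(self, c):
        p0 = c.to_server.get()
        c.to_master.put(self.solve(p0))

    def _master(self, c, p):
        for i in range(self.n):
            c[i].to_server.put(self.generate(i + 1, p))
        b = [c[i].to_master.get() for i in range(self.n)]
        return self.summarize(p, b)

    def compute(self, p):
        c = [Channel() for _ in range(self.n)]
        servers = [threading.Thread(target=self._server, args=(ch,)) for ch in c]
        for t in servers:
            t.start()
        s = self._master(c, p)
        for t in servers:
            t.join()
        return s


class SumOfSquares(MasterServer):
    """Hypothetical client: only sequential, problem-specific code."""

    def generate(self, i, p):
        return p[i - 1 :: self.n]   # every n-th element, starting at i

    def solve(self, p0):
        return sum(x * x for x in p0)

    def summarize(self, p, b):
        return sum(b)


result = SumOfSquares().compute(list(range(1, 11)))
```

The client never touches threads or queues; it supplies three sequential procedures and calls compute, which is the division of labour the generic algorithm is meant to enforce.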




Fig. 1. Generic master-server algorithms 



type problem = ..; 
     solution = ..; 
     set = array [1..n] of solution; 
procedure generate( i: integer; 
  p: problem; var p0: problem ); 
procedure solve( 
  p0: problem; var s: solution ); 
procedure summarize( p: problem; 
  b: set; var s: solution ); 
procedure compute( 
  p: problem; var s: solution ); 



Fig. 2. Generic parameters 



module MS; 
const n = 10; {number of servers} 
type 
  problem* = ..; solution* = ..; 
  set* = array [1..n] of solution; 
  channel = *( problem, solution ); 
  net = array [1..n] of channel; 

procedure solve*( 
  p0: problem; var s: solution ); 
begin end; 

procedure generate*( i: integer; 
  p: problem; var p0: problem ); 
begin {default:} p0 := p; end; 

procedure summarize*( 
  p: problem; 
  b: set; var s: solution ); 
begin end; 

procedure server( c: channel ); 
var p0: problem; s0: solution; 
begin receive(c, p0); 
  solve(p0, s0); 
  send(c, s0); 
end; 

procedure master( c: net; 
  p: problem; var s: solution ); 
var i: integer; 
  p0: problem; b: set; 
begin 
  for i := 1 to n do begin 
    generate(i, p, p0); 
    send(c[i], p0); 
  end; 
  for i := 1 to n do 
    receive(c[i], b[i]); 
  summarize(p, b, s); 
end; 

procedure compute*( 
  p: problem; var s: solution ); 
var c: net; i: integer; 
begin 
  for i := 1 to n do open(c[i]); 
  parallel 
    master(c, p, s) | 
    forall i := 1 to n do 
      server(c[i]) 
  end; 
end; 

begin end. {MS} 



Fig. 3. Embeddable module master-server, MS. Public entities are marked by *. 



3 Derivation of Specific Parallel Algorithms by Means of Module 
Embedding 

A parallel generic algorithm is a common parallel control structure (such as master- 
server) in which process communication and synchronization are specified in a 
problem-independent manner. Clients of the generic algorithm can derive particular 
applications from the generic algorithm by extending it with domain-specific 
sequential algorithms. When the generic algorithm is specified as a module, the 
derivation of specific applications can be achieved by means of module embedding. 
An application module can embed the generic master-server module and override 
relevant entities that are inherited from the embedded module, giving them more 
specialized meaning. This is explained in detail in the next section. 

3.1 Derivation of Parallel Integration Application 

Consider, for example, the problem of deriving a simple parallel integration algorithm 
based on the trapezoidal method. This can be achieved by extending module MS into a 
module TI (Fig. 4). The embedding module, TI, inherits the components of the base 
module, MS, and re-exports all inherited public entities. In addition, module TI introduces 
a new generic parameter f, the function to be integrated, which should be supplied by 
clients of TI. 

The embedding module, TI, overrides the inherited type problem, so that the new 
problem definition incorporates the lower and upper limits a, b of the integral to be 
calculated. Similarly, TI overrides the inherited type solution, so that the new solution 
definition incorporates the integral value v. Note that problem and solution were 
originally defined in module MS as empty record types. Overriding of non-empty 
record types is also permitted, as illustrated in the next section. 

The embedding module also overrides the inherited default version of procedure 
generate and the inherited ‘null’ versions of procedures solve and summarize. The 
newly declared version of generate divides the integration range into n equal parts, 
one for each server. Procedure solve is defined in TI to be trapezoidal integration. 
Procedure summarize sums up the partial integrals provided by the n servers. 
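The arithmetic behind these three procedures can be sketched in a few lines of Python. This is an illustration, not the paper's code: a thread pool stands in for the n server processes, and the function signatures are simplified assumptions. The body of solve mirrors the single-panel trapezoidal formula of Fig. 4, and the integrand is the one used by the application module.

```python
import math
from concurrent.futures import ThreadPoolExecutor

N = 10  # number of servers, as in module MS


def generate(i, a, b):
    """Sub-range for server i (1-based): the i-th of N equal parts of [a, b]."""
    h = (b - a) / N
    return a + (i - 1) * h, a + i * h


def solve(f, a0, b0):
    """Single-panel trapezoidal rule over [a0, b0]."""
    return ((b0 - a0) / 2) * (f(a0) + f(b0))


def summarize(partials):
    """Final result: the sum of the partial integrals."""
    return sum(partials)


def compute(f, a, b):
    # Each server integrates its own sub-range; the pool models the servers.
    with ThreadPoolExecutor(max_workers=N) as pool:
        futures = [
            pool.submit(solve, f, *generate(i, a, b)) for i in range(1, N + 1)
        ]
        return summarize(fut.result() for fut in futures)


def f(x):
    return x * math.sin(math.sqrt(x))   # integrand from the application module


linear = compute(lambda x: x, 0.0, 1.0)  # trapezoids are exact for straight lines
approx = compute(f, 0.0, 1.0)
```

Because every server applies one trapezoid to its own sub-range, the overall result is the composite trapezoidal rule with N panels, so accuracy in this scheme is tied to the number of servers.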



module TI(MS); 
type problem* = 
       record a*, b*: real; end; 
     solution* = record v*: real; end; 

function f*( x: real ): real; 
begin end; 

procedure solve*( p0: problem; 
  var s: solution ); 
begin s.v := ((p0.b - p0.a)/2) * 
             (f(p0.a) + f(p0.b)); 
end; 

... complete implementations of 
    procedures generate and 
    summarize ... 
end. {module TI} 


module IA(TI); 
var p: problem; s: solution; 

function f*( x: real ): real; 
begin f := x * sin(sqrt(x)); end; 

begin compute(p, s) end. {IA} 



Fig. 4. Derived modules trapezoidal integration, TI, and integration application, IA. 



Module TI can in turn be embedded into a specific integration application 
module, IA, that defines a particular function f to be integrated. Module IA serves as a 
main program by invoking procedure compute that is provided by the generic MS 
module (Fig. 4). 

3.2 Derivation of Parallel Simulated Annealing and Traveling Salesperson 
Algorithms 

A variety of specific parallel algorithms can be derived from the same generic 
parallel algorithm. For example, we have derived a generic algorithm for approximate 
optimization that is based on simulated annealing, organized as module SA (Fig. 5). 
Note that the definition of type annealingPoint contains a component, dE, that is 
needed for all possible applications of simulated annealing. 

The generic simulated annealing algorithm can be used to derive approximate 
algorithms for different intractable optimization problems. For instance, we have 
derived a parallel algorithm for a particular traveling salesperson problem (module 
TSP in Fig. 5). Note that the inherited definition of type annealingPoint is overridden 
in TSP by adding two new problem-specific components, i, j, to the inherited 
component dE. 



module SA(MS); 
type 
  problem* = record 
    ... annealing parameters ... 
  end; 
  annealingPoint* = record 
    dE*: real; 
  end; 

... procedures select and change 
    declared as generic parameters ... 

procedure solve*( 
  p0: problem; var s: solution ); 
  ... complete implementation that 
      performs simulated annealing 
      using the generic parameters select 
      and change ... 
end. {SA} 


module TSP(SA); 
type 
  city = record x, y: real end; 
  tour = array [1..m] of city; 
  solution* = record t: tour; end; 
  annealingPoint* = record 
    ... field dE inherited from SA ... 
    i, j: integer; 
  end; 

var p: problem; s: solution; 

... complete implementations of 
    procedures select, change, 
    summarize ... 

begin compute(p, s) end. {TSP} 





Fig. 5. Derived modules simulated annealing, SA, and traveling salesperson, TSP. 



4 Interference Control for Embeddable Modules 



When a parallel application is executed repeatedly with the same input, the relative 
speeds of its constituent parallel processes may vary from one execution to another. If 
one parallel process updates a variable and another process updates or reads that same 
variable, the order in which those processes access the variable may vary from one 
execution to another, even when the input for the parallel application does not change. 
Such parallel processes are said to interfere with each other in a time-dependent 
manner due to a variable conflict. Interfering parallel processes may update and 
possibly read the same variable at unpredictable times. The output of an application 
that contains interfering parallel processes may vary in an unpredictable manner when 
the application is executed repeatedly with the same input. Such an application is said 
to be insecure due to a time-dependent error. A secure parallel programming 
language should allow detection and reporting of as many time-dependent errors as 
possible. The implementation may efficiently detect time-dependent errors through 
process interference control at compile time and, less efficiently, at run time. 

Hansen [4] advocated the benefits of interference control and developed an 
interference control scheme for the parallel programming language SuperPascal. The 
SuperPascal language is subject to several restrictions that allow effective syntactic 
detection of variable conflicts, i.e., detection at compile time. These restrictions apply 
to a variety of language constructs and assure that a variable that is updated by a 
parallel process may be read only by that process. Note that parallel processes are 
allowed to share read-only variables. 

For each statement belonging to a SuperPascal program, the compiler determines 
the target variable set and the expression variable set of that statement. The target 
variable set consists of all variables that may be updated during the execution of the 
statement, while the expression variable set consists of all variables that may be read 
during that statement’s execution. In SuperPascal, processes are created by parallel 
and forall statements. A parallel statement parallel S_1 | S_2 | ... | S_n end incorporates n 
process statements S_1, S_2, ..., S_n such that the target variable set of S_i is disjoint from the 
target and expression variable sets of S_1, ..., S_(i-1), S_(i+1), ..., S_n, for i = 1, 2, ..., n. A forall 
statement forall i := m to n do S incorporates a single element statement S which 
generates n-m+1 processes and, for this reason, is required to have an empty target 
variable set. 
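The two rules can be stated compactly as set operations. The Python sketch below is a simplified model, assuming the target and expression variable sets have already been computed for each statement; the function names `parallel_ok` and `forall_ok` and the variable names are hypothetical and do not come from the SuperPascal compiler.

```python
def parallel_ok(statements):
    """statements: list of (target_set, expression_set) pairs, one per
    process statement of a parallel statement. The target set of each
    statement must be disjoint from the target and expression sets of
    every other statement."""
    for i, (target_i, _) in enumerate(statements):
        for j, (target_j, expr_j) in enumerate(statements):
            if i != j and target_i & (target_j | expr_j):
                return False
    return True


def forall_ok(target_set):
    """The element statement of a forall must have an empty target set."""
    return not target_set


# Two processes update different variables and both read c: accepted.
ok = parallel_ok([({"x"}, {"c"}), ({"y"}, {"c"})])

# The second process reads x while the first updates it: rejected.
conflict = parallel_ok([({"x"}, set()), ({"y"}, {"x"})])
```

Shared reading is never a conflict under this check; only a variable that appears in some statement's target set and in any set of another statement causes rejection.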

It should be noted that the above restrictions on target and expression variable sets 
are very natural for parallel applications running in a cluster computing environment. 
Processes that are generated by a forall statement will run on separate cluster nodes. If 
such processes were to share a target variable, it could be quite hard and inefficient to 
synchronize that shared access over a network. At the same time, it is easy to make 
these processes efficiently share read-only variables by broadcasting those variables' 
values just once to all processes. Similar considerations apply to processes that are 
generated by a parallel statement. 

A SuperPascal parallel application consists of a single main program, exactly as in 
the standard Pascal language. The interference control scheme of SuperPascal [4] 
guarantees that single-module parallel applications do not contain time-dependent 
errors, i.e., they are secure in this sense. The Paradigm/SP language has been designed 
as an extension to SuperPascal that introduces separately compiled embeddable 
modules [11]. We have extended the single-module interference control scheme of 
SuperPascal to serve the specific requirements of Paradigm/SP. 

In SuperPascal, procedures are never overridden. Therefore, the target and 
expression variable sets for procedure statements can be determined during the 
compilation of SuperPascal's single-module parallel applications. This is not the case 
in a language with embeddable modules, such as Paradigm/SP; procedures that are 
defined in an embeddable module M0 can be overridden in an embedding module M1. 
The overriding procedures that are defined in M1 may have different target and 
expression variable sets from those in M0. Therefore, procedure statements in the 
embedded module M0, a module that has already been separately compiled, may have 
their target and expression variable sets changed by procedure overriding in M1. Thus, 
restrictions on target and expression variable sets that have been validated during the 
compilation of M0 may be violated later, when M0 is embedded in M1. 



module M0; 
procedure p*( j: integer ); 
begin end; 
begin 
  forall i := 1 to 10 do p(i); 
end. {M0} 


module M1(M0); 
var k: integer; 
procedure p*( j: integer ); 
begin k := j end; 
begin k := 0 end. {M1} 



Fig. 6. Modules M0 and M1. 

Consider, for example, module M0 from Fig. 6 that defines and exports procedure 
p. Module M0 contains a statement forall i := 1 to 10 do p(i) that generates processes 
by executing the procedure statement p(i). The procedure statement p(i) has an empty 
target variable set; therefore, its generated processes do not interfere due to variable 
conflicts. 

Assume now that module M0 is embedded in module M1 and that M1 overrides the 
inherited procedure p, as illustrated in Fig. 6. The overriding body of p may have 
access to a global variable, k. Therefore, the target variable set of the procedure 
statement p(i) in the separately compiled module M0 will now actually contain the 
variable k, and will, therefore, be non-empty. 

The main difficulty for interference control in a language framework with module 
embedding comes from the possibility of changing, through procedure overriding, the 
target and expression variable sets in embedded modules that have been already 
separately compiled. We remedy this problem by introducing additional restrictions 
that make it impossible to modify variable sets during procedure overriding. More 
precisely, we exclude the so-called unrestricted procedures (and functions) from 
parallel and forall statements, as explained below. 

A procedure that is declared in a module can be marked for export with either a 
restricted mark or an unrestricted mark. A procedure exported by a module M0 
can be overridden in an embedding module M1, provided that the procedure heading 
in M1 is the same as in M0 (in particular, the export mark, restricted or unrestricted, 
must be the same). 

A restricted procedure is a procedure exported with a restricted mark. 

An unrestricted procedure is: 

- a procedure that is exported with an unrestricted mark, or 

- a private procedure that invokes an unrestricted procedure, or both. 

A restricted procedure is not permitted to use global variables (directly or 
indirectly) or to invoke unrestricted procedures. In contrast, an unrestricted procedure 
can use global variables and invoke unrestricted procedures. 

Overriding an unrestricted procedure p in an embedding module M1 may change 
target and expression variable sets in a separately compiled embedded module M0, 
because the overriding procedure is allowed to access global variables. This is why 
parallel statements and forall statements are not permitted to invoke unrestricted 
procedures. This requirement is in addition to the limitations on target and expression 
variable sets in parallel and forall statements, as defined earlier in this section. 
Restricted procedures are not excluded from parallel and forall statements because, in 
contrast to unrestricted procedures, they cannot modify target or expression variable 
sets in separately compiled embedded modules. 

There are also procedures that are neither restricted nor unrestricted, such as, for 
example, private procedures that use global variables but do not invoke unrestricted 
procedures. This category of procedures may participate in parallel and forall 
statements as well, as far as they comply with the limitations on target and expression 
variable sets, as discussed earlier in this section. 
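The classification rules of this section lend themselves to a small checker. The Python sketch below is a toy model under explicit assumptions: the `Proc` record, its field names, and the error messages are hypothetical, marks are given as strings rather than export symbols, and the separate limits on target and expression variable sets are not modelled. It is not the Paradigm/SP compiler's algorithm.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Proc:
    mark: Optional[str] = None   # "restricted", "unrestricted", or None (private)
    uses_globals: bool = False   # direct access to a global variable
    calls: Set[str] = field(default_factory=set)


def unrestricted(procs: Dict[str, Proc]) -> Set[str]:
    """Exported-unrestricted procedures plus private ones that invoke them."""
    result = {n for n, p in procs.items() if p.mark == "unrestricted"}
    changed = True
    while changed:
        changed = False
        for n, p in procs.items():
            if p.mark is None and n not in result and p.calls & result:
                result.add(n)
                changed = True
    return result


def touches_globals(name, procs, seen=None) -> bool:
    """True if the procedure uses a global directly or through its callees."""
    seen = set() if seen is None else seen
    if name in seen:
        return False
    seen.add(name)
    p = procs[name]
    return p.uses_globals or any(touches_globals(c, procs, seen) for c in p.calls)


def violations(procs, process_calls) -> List[str]:
    """process_calls: procedures invoked from parallel or forall statements."""
    bad = unrestricted(procs)
    errors = []
    for n, p in procs.items():
        if p.mark == "restricted":
            if touches_globals(n, procs):
                errors.append(n + ": restricted procedure uses a global variable")
            if p.calls & bad:
                errors.append(n + ": restricted procedure invokes an unrestricted one")
    for n in process_calls:
        if n in bad:
            errors.append(n + ": unrestricted procedure used in parallel/forall")
    return errors


# Fig. 6 as printed: p is restricted, but the overriding body assigns k.
fig6 = violations({"p": Proc(mark="restricted", uses_globals=True)}, ["p"])
# Variant: p exported as unrestricted, still invoked from the forall statement.
variant = violations({"p": Proc(mark="unrestricted", uses_globals=True)}, ["p"])
# A restricted p that touches no globals passes both checks.
clean = violations({"p": Proc(mark="restricted")}, ["p"])
```

Either way of giving p access to k is rejected, once at the overriding declaration and once at the forall statement, which is the property that keeps separately compiled modules safe.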

Consider again the example modules in Fig. 6. Procedure p is declared with a 
restricted mark (written p* in Fig. 6). Therefore, accessing a global variable such as k 
in M1 is a syntax error. Procedure p would be allowed to access a global variable if p 
were declared with an unrestricted mark. In such a case, however, the use of p in a 
forall statement like the one in M0 would be a syntax error. 

The exclusion of unrestricted procedures from parallel and forall statements 
permits syntactic detection of variable conflicts in separately compiled modular 
parallel applications. The Paradigm/SP compiler guarantees that a variable that is 
updated by a process cannot be used by another process, while sharing read-only 
variables is permitted. Paradigm/SP parallel applications may not be insecure due to 
variable conflicts. 

Is the exclusion of unrestricted procedures from parallel and forall statements a 
serious practical limitation? Technically, it means that if an exported procedure is used 
to generate a process, and if it needs to access global variables, it must do so through 
explicit send/receive statements or through parameters, rather than directly. We are 
convinced that this restriction is quite natural in the domain of message-passing cluster 
algorithms, because parallel access to global variables from different processes must 
be implemented through send/receive, anyway. Programmers who are forced to 
implement access to global variables through explicit send/receive statements are 
more likely to be aware of the underlying inefficiency of such access, in contrast to 
programmers for whom implicit message passing is generated by the implementation. 
Our experiments with four generic algorithms and several derivatives from each of 
them make us believe that the exclusion of unrestricted procedures from parallelism is 
not a serious practical limitation. 

5 Conclusions 

This paper outlines module embedding, a form of inheritance that applies to modules 
and that permits overriding of inherited types. Embeddable modules have been 
incorporated in a parallel programming language called Paradigm/SP. A prototype 
implementation of Paradigm/SP has been developed and documented [12]. 
Paradigm/SP has been used to specify generic parallel algorithms and to derive 
concrete parallel applications from them by means of module embedding. 
Paradigm/SP has been used as a higher-level prototyping language in order to 
conveniently test the validity of derived parallel applications before finally converting 
them into efficient C code that runs in a cluster-computing environment, such as 
PVM. 

We have specified several generic parallel algorithms as embeddable modules, such 
as a probabilistic master-server [10], a cellular automaton [9], and an all-pairs pipeline 
[8]. Through module embedding, we have derived diverse parallel applications from 
such generic algorithms. Despite the use of generic parallelism, most of the derived 
applications have demonstrated very good performance in cluster-computing 
environments, and a couple of derived applications have achieved superlinear 
speed-up [8]. 

We have adopted an interference control scheme for embeddable modules. This 
scheme guarantees that processes in derived applications do not interfere by reading 
and updating the same variable. That derived algorithms are secure in this sense is 
what makes module embedding unique in comparison to traditional object-oriented 
techniques supported by C++, Java, CORBA, etc., where no static control helps 
programmers to avoid time-dependent errors in derived algorithms. For example, it 
has been recognized that Java multithreaded applications are inherently insecure 
because nothing prevents different threads from invoking unsynchronized methods [3]. 
A related insecure feature of Java is that data members are by default protected and 
that protected data members can be accessed from all classes that belong to the same 
package. For these reasons, it is easy to gain access from different threads to protected 
data members by adding new classes to a package and to create applications that are 
insecure due to time-dependent errors. 

Others have proposed dynamic load-time class overriding through byte-code editing 
[6]. This technique is justified by the so-called adaptation and evolution problems that 
appear when sub-classing is used to build software components. Our approach has the 
merit of integrating type overriding within the programming language and its 
compiler. 

In traditional modular object-oriented languages, such as Oberon-2, Ada-95 and 
Modula-3, modules are not embeddable, while classes are represented by means of 
extensible record types [15]. What is different in our approach to classes is that record 
type extension overrides an existing type (both in the new embedding module and in 
the existing embedded module) and does not introduce a new type. A disadvantage of 
embeddable modules as compared to classes is that modules do not introduce types, 
and therefore cannot be used to create multiple instances. Furthermore, inherited type 
overriding imposes additional run-time overhead on the implementation. It has been 
recognized [7], [13] that both modules and classes support necessary abstractions, 
which should be used as complementary techniques. 

A collection of object-oriented language features that support the development of 
parallel applications can be found in [1], [2]. Parallel programming enhancements of a 
mainstream language, C++, are presented in [14]. A survey of earlier object-parallel 
languages is contained in [16]. An example of template-based genericity is contained 
in [17]. We do not know of a traditional object-oriented language that performs static 
analysis in order to guarantee that parallel applications are free of time-dependent 
errors. The main benefit of module embedding is that it guarantees at compile time the 
lack of such errors and that its static interference analysis scheme eliminates the 
overhead of run-time synchronization. 

Paradigm/SP is a specification and prototyping language and as such is simpler than 
production languages and environments. Algorithm developers may focus on what is 
essential in their developed parallel control structures and application methods without 
being burdened by the complex details that are required for efficient practical 
programming. Simplicity and ease of use are advantages of Paradigm/SP as an 
algorithm development and validation language in comparison to production 
languages and environments. 

As a continuation of this project in the future, we envision that it would be possible 
and beneficial to develop an interference control scheme for multithreaded Java 
applications. A Java source code analyzer may be used to discover variable conflicts 
between threads and to help eliminate time-dependent errors due to such conflicts. 

If algorithms are to be published on the web, they can be shaped as multimedia 
web pages. A separately compiled module can be shaped as a source HTML file that can 
be fed into a compiler in order to produce executable code. Module import and 
embedding can be designated by means of hyper-links. Source modules that comprise 
an application can reside on different servers. These same servers can host 
corresponding distributed executable objects. The design of adequate language and 
compiler support is another possible continuation of this project in the future. 

Abstract. For the past few years we have been working on a parallel 
programming language, Mianjin, suitable for writing parallel programs 
for non-dedicated networks of workstations. This paper reviews an 
innovative feature of the language, a type system that statically enforces 
global system behaviour. This is achieved by typing the behaviour of 
commands, thereby differentiating commands that may admit 
communication from those that do not. Doing this guarantees safe asynchronous 
communications; in particular it prevents deadlocks caused by 
exhaustion of system-level communication resources (such as buffers) which 
are beyond an application programmer’s control. These command 
types propagate through client and library code thereby simplifying some 
problems associated with constructing software components. The type 
system is semi-formally described using type rules, and some further 
applications of the idea to software components are discussed. 



1 Introduction 

For the past few years we have been researching and developing a parallel lan- 
guage called Mianjin^, for programming non-dedicated networks of workstations 
(the Gardens project [5]). Since the original publication describing Mianjin [4] 
the language has been refined and improved. We have come to realise that its 
control of global system behaviour by typing the behaviour of commands is par- 
ticularly innovative and useful, and may be applicable to other situations. This 
paper describes this feature semi-formally, and how it might be applicable to 
controlling other forms of global system behaviour which complicate the deve- 
lopment of large software systems. 

Mianjin is an object oriented language in the Pascal tradition; it is based 
on Oberon-2, with some extensions. It is designed for parallel programming 
so the efficient control of tasking and communication is vital. For this reason 

^ ‘Mianjin’ means place of the blue water lilies; it is an Aboriginal name for Gardens 
Point, where QUT is located. 



J. Gutknecht and W. Weck (Eds.): JMLC 2000, LNCS 1897, pp. 38-50, 2000. 
© Springer-Verlag Berlin Heidelberg 2000 




Mianjin: A Parallel Language with a Type System 



39 



Mianjin supports distributed objects for communication; these have some 
similarities with Java RMI. Notable differences from Java RMI are its efficiency, 
asynchronous remote method call and atomicity typing; the latter is the subject 
of this paper. Atomicity typing statically controls object reentrance. This 
prevents deadlocks caused by exhaustion of system-level communication resources 
(such as buffers) which are beyond an application programmer’s control. Note that 
this approach does not aim at reducing the computational power of the 
language; application-level deadlocks are not excluded, only implicit system-level 
ones. Other related languages and systems include CC++, Charm++, Orca, 
pSather, SplitC, CORBA, and DCOM. SplitC [2] is closest in that its 
communication model is also based on asynchronous send combined with non-selective 
synchronous receive, but it is not type safe. Note that in this paper we only describe 
and formalise a simplified version of Mianjin. 

The main contribution of this paper is to show how the global behaviour 
of a system, in particular certain forms of object reentrance, may be statically 
controlled through a type system, and to present a semi-formal description of 
Mianjin’s type system, which does this. 

The next section overviews Mianjin’s support for distributed objects. Sec- 
tion 3 discusses general issues of object reentrance. Section 4 introduces atomi- 
city typing and justifies the overall design decisions underpinning the aspects of 
Mianjin as presented in this paper. Section 5 presents a semi-formal description 
of Mianjin’s type system, which controls object reentrance - a global system 
property. Section 6 describes how these features may be used to solve other 
problems, and the final section concludes. 



2 Distributed Objects in Mianjin 



Communication in Mianjin is realised via asynchronous global (remote) method 
calls, cf. Java RMI [6]. Like some other languages, Mianjin differentiates local 
from potentially global objects. Global objects are ordinary local objects which 
have been exposed globally. Through such global references, objects are subject 
to a restricted set of operations: tests for equality, locality tests/casts and invoca- 
tion of global methods. Global methods are ordinary methods, again subject to 
constraints. Locally, objects support a superset of these global object operations 
including direct field access. 

Here is a simple example of a local object which may be exposed globally. 
Note we use the syntax and semantics of our toy mini-Mianjin language, for- 
malised in Section 5. In particular all objects are heap-allocated and object 
types, records in Oberon terminology, are reference types, as in Java [7]. The 
full Mianjin language has syntax and semantics based on Oberon-2. The exam- 
ple declares an object type that supports a single method Add and has two fields 
count and sum. An Acc object represents a form of accumulator to which other 
tasks may contribute values via Add. The client code uses the object remotely. 
RECORD Acc(ANYREC)                    (* a record = class *)
  count: INTEGER                      (* a field *)
  sum: INTEGER                        (* another field *)

  GLOBAL METHOD Add (s: INTEGER)      (* a global method *)
  BEGIN
    THIS.sum := THIS.sum + s
    THIS.count := THIS.count - 1
  END Add
END

VAR gsum: GLOBAL Acc                  (* client code *)

gsum.Add(42)

For an object like gsum to be used remotely its type must be labelled GLOBAL. 
The GLOBAL type annotation enforces that only global methods such as Add may 
be invoked on gsum; the record fields and any non-GLOBAL methods (of which 
there are none) are not accessible through it. 

Local objects, the default, are always located locally. Global objects may be 
either located remotely or locally. Thus an implementation of the call 
gsum.Add(localval) will result in a communication if gsum is not local. In 
Mianjin global method invocations are asynchronous. However, since synchronous 
invocation is just a special case, a Mianjin implementation is free to invoke 
global methods on local objects synchronously. In the example, if gsum happens to 
refer to a local object, then gsum.Add can be executed as a normal synchronous 
call. 
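
The choice between a synchronous local call and an asynchronous remote one can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the class GlobalRef, the task ids and the outbox queue are hypothetical names introduced here, and the sketch says nothing about how a real Mianjin runtime represents global references.

```python
from collections import deque

class Acc:
    """Local accumulator, reduced to the sum field of the Acc record."""
    def __init__(self):
        self.sum = 0

    def add(self, s):              # plays the role of GLOBAL METHOD Add
        self.sum += s

class GlobalRef:
    """Hypothetical global reference (not Mianjin's actual runtime layout)."""
    def __init__(self, owner, here, outbox, target=None):
        self.owner = owner         # id of the task that holds the object
        self.here = here           # id of the task holding this reference
        self.outbox = outbox       # queue of outgoing messages
        self.target = target       # the object itself, known only if local

    def call(self, method, *args):
        if self.owner == self.here:
            # Local receiver: an ordinary synchronous call is permitted.
            getattr(self.target, method)(*args)
        else:
            # Remote receiver: enqueue the request and return at once.
            self.outbox.append((self.owner, method, args))

outbox = deque()
acc = Acc()
GlobalRef(owner=0, here=0, outbox=outbox, target=acc).call("add", 42)
GlobalRef(owner=1, here=0, outbox=outbox).call("add", 7)
```

After these two calls the local accumulator has been updated directly, while the second call has only produced a queued message for task 1; the caller's code is the same in both cases.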

Thus Mianjin supports location-transparent communication via global ob- 
jects and their associated global methods. Global object references are valid 
across machines and hence are location independent. They can therefore be 
communicated freely among tasks running across a distributed system. In gene- 
ral local objects are compatible with global ones but not vice versa, other than 
through an explicit (and checked) runtime type cast. 

Issues such as tasking are system-specific. Other issues such as returning 
values from global methods to implement efficient read operations are beyond 
the scope of this paper and discussed elsewhere [4]. Here we focus on the core 
issue of controlling object reentrance, and, to a lesser extent, local versus global 
objects. 

3 Object Reentrance: A Global System Property 

Reentrant calls into state-based abstractions introduce a number of well-under- 
stood problems [8]. These problems, fundamentally, are all rooted in the ab- 
straction’s implementation being in some intermediate state when it made the 
outcall that eventually led to the reentrant call. Designing and implementing 
object-oriented programs in a way that such reentrancy does not lead to errors 




Mianjin: A Parallel Language with a Type System 



41 



is known to be difficult. Matters get worse when the state held by a reentered 
object is not actually managed by that object itself. Since object reentrance 
may occur through complex call sequences involving third party code it must be 
controlled globally within a system; it is not a local property. 

Let us consider what happens in the case of an unconstrained language with 
asynchronous messaging similar to Mianjin. Assume that the underlying com- 
munication mechanism needs to allocate a buffer whenever a method is called 
and can only release the buffer once that method returned (methods are asyn- 
chronous). If such buffers are allocated out of a system-managed buffer pool, 
then the limit on the number of available buffers is not controlled by the ap- 
plication. If the communication system runs out of buffers, it blocks and waits 
for buffers to become available. However, if recursive remote method invocations 
are allowed, then such waiting for buffers can easily lead to a deadlock. 
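
The buffer argument can be made concrete with a deliberately simplified Python model. It assumes that every call in flight holds exactly one buffer until its method returns and that each method body makes the next call before returning; BufferPool and nested_calls are illustrative names, and the model reports the deadlock instead of actually blocking.

```python
class BufferPool:
    """System-managed pool; its size is outside the application's control."""
    def __init__(self, size):
        self.free = size

def nested_calls(pool, depth):
    """Simulate a chain of `depth` remote calls, each made from inside the
    previous method body. Returns True if the whole chain completes and
    False if it would block forever waiting for a buffer."""
    if depth == 0:
        return True
    if pool.free == 0:
        # Every buffer is held by a caller further up the chain, and none
        # of them can return before this call completes: a deadlock.
        return False
    pool.free -= 1                 # buffer is held for the duration of the call
    ok = nested_calls(pool, depth - 1)
    pool.free += 1                 # released only once the method has returned
    return ok
```

With a pool of four buffers a chain of four nested calls completes, whereas a chain of five can never obtain its last buffer. The application cannot rule this out by itself, because it does not choose the pool size.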

It might seem that this form of reentrancy is a special case of the more 
general situation of concurrent calls. Therefore, it would seem that traditional 
concurrency control and synchronisation mechanisms can be used to solve this 
problem. Unfortunately, this is not true and concurrent invocations and reentrant 
invocations are two different problems entirely. To see why, assume that some 
form of locking was used in an attempt to control reentrant calls. Since the 
reentrant call is logically made by the same thread that entered the now-locked 
section in the first place, an immediate deadlock would follow. To break this 
tie of “self-inflicted” deadlocks, languages such as Obliq [1] and Java allow for 
repeated granting of a lock to the same thread. More generally, techniques for 
handling concurrent calls are in principle unable to handle reentrant calls. 
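
The difference between the two kinds of lock can be observed directly with Python's standard threading module; the helper try_reenter is a name introduced for this sketch. A non-blocking acquire is used so that the non-reentrant case reports failure instead of hanging.

```python
import threading

def try_reenter(lock):
    """Acquire `lock`, then try to acquire it again from the same thread,
    as a reentrant call would. Returns whether the second acquire succeeds."""
    with lock:
        got_it = lock.acquire(blocking=False)   # non-blocking, so no hang
        if got_it:
            lock.release()
        return got_it

plain = try_reenter(threading.Lock())       # second acquire is refused
reentrant = try_reenter(threading.RLock())  # second acquire is granted
```

With the plain lock a blocking acquire would have deadlocked the thread against itself. With the reentrant lock the second entry is admitted, so the lock does nothing to protect the intermediate state of the first entry, which is exactly the reentrancy problem.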

Another solution to problems caused by reentrant calls is to not allow them. 
For such an approach to work, reentrancy must be ruled out statically, or else 
a very hideous form of run-time-detected error results. A straightforward static 
approach would be to simply disallow any method call on another object from 
within a method’s body. This is clearly not useful, since objects would not be 
able to call on other objects at all. However, only a slight modification of this 
approach leads to a scheme that proved to be useful and practical in the context 
of Mianjin. 



4 Atomicity Typing 

In Mianjin we prevent the problems mentioned in the previous section of un- 
constrained object reentrance by typing the atomicity of commands. We sort 
methods into three types: atomic, global, and polling methods. Atomic methods 
perform no communication and accept no messages (do not poll). Global methods 
represent methods which may be implemented remotely, and hence their 
implementation must neither accept other messages nor communicate. Polling methods 
may perform communications and poll. Given this categorisation of methods 
we type commands in Mianjin, e.g. method calls, by their atomicity behaviour: 
either atomic or polling. This results in the following typing for method 
declarations: atomic methods can only call atomic commands; global methods can only 
call atomic commands; polling methods can call polling or atomic commands. 
The only other similar system we know of is the restriction in C++ and Java 
that a method can only call methods with declared exceptions, if it itself either 
handles these exceptions or declares to throw a superset of them. 
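
The calling discipline just described amounts to a small table, sketched here in Python. The names ALLOWED and body_ok are assumptions of this sketch; the actual check is part of Mianjin's type rules, not a separate table lookup.

```python
ATOMIC, POLL = "ATOMIC", "POLL"

# Command atomicities that a method body may contain, by declared method kind.
ALLOWED = {
    "ATOMIC": {ATOMIC},
    "GLOBAL": {ATOMIC},
    "POLL": {ATOMIC, POLL},
}

def body_ok(method_kind, command_atomicities):
    """True if every command in the body is permitted for this method kind."""
    return all(a in ALLOWED[method_kind] for a in command_atomicities)
```

For instance, body_ok("GLOBAL", [ATOMIC, POLL]) is rejected, while the same body is accepted for a method declared POLL.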

Mianjin assumes a model that enables an implementation of global method 
calls over message passing networks. Code that calls a global method and code 
that polls for incoming requests must therefore adhere to policies that prevent 
deadlocks caused by mutual waiting for network buffer resources. Atomicity ty- 
ping statically guarantees that Mianjin code follows one such policy. Since global 
methods can neither call other global methods nor cause the polling for incoming 
calls, a global method cannot cause further pressure on network resources while 
it is executing. Deadlocks are thus statically prevented. 

A global method can only call atomic code (code that does not communicate), 
but could post a continuation action that could communicate on its behalf. 
Once the global method returns, some scheduler can call pending continuation 
actions. For the sake of allowing (deferred) communication to follow incoming 
messages, the problem of network-resource pressure has thus been converted 
into one of application-resource pressure (to hold on to pending continuations). 
In the end, this is unavoidable, since applications built using a partially recursive 
computational model can never be statically kept from running out of resources 
or deadlocking. However, deadlocks must not be caused by resource pressure 
that the application itself cannot control. 
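
A minimal Python sketch of the continuation idea follows. The queue pending, the stand-in network list sent and the function names are all hypothetical; the point is only that the body of the global method stays atomic and the communication happens later, from a context that is allowed to communicate.

```python
from collections import deque

pending = deque()      # application-level queue of continuations
sent = []              # stands in for the network in this sketch

def send(dest, msg):
    """A communicating operation; not allowed inside a global method."""
    sent.append((dest, msg))

def on_request(sender, value):
    """Body of a global method: atomic, so it does not call send() itself.
    It records what should happen later instead."""
    pending.append(lambda: send(sender, value * 2))

def run_continuations():
    """Called by a scheduler once the global method has returned."""
    while pending:
        pending.popleft()()
```

Calling on_request("task1", 21) leaves the network untouched and one continuation pending; a later run_continuations() performs the deferred send. The memory used by pending is the application-resource pressure mentioned above.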

The Mianjin model opts for a remote method invocation model that ena- 
bles an asynchronous implementation, but is carefully balanced to not require 
asynchronous invocation. This is most important when aiming for a flexible dis- 
tribution of objects across networked machines. Asynchronous communication 
naturally masks the potentially significant latency of remote calls without re- 
quiring the use of additional threads at the calling end. At the same time, local 
calls are much more efficiently executed synchronously. 

Potentially asynchronous calls need to be part of an appropriate commu- 
nication model. Besides restricting signatures of methods that are to support 
potentially asynchronous calls (no out, inout, or return values), the receiving 
end also needs attention. Synchronous receive is required to avoid heavy use 
of multiple threads. To avoid deadlocks, receive operations must not wait for a 
specific call or else the automatic creation of threads would again be required. 
The Mianjin polling mechanism synchronously requests delivery of all pending 
calls, but does not block the system if there are no calls pending. It therefore 
enables a model that keeps threading orthogonal to asynchronous computing. 
(At a library rather than language level, this approach was pioneered by Active 
Messages [9].) 
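
The polling behaviour described here (deliver everything that is pending, never wait) can be sketched in a few lines of Python. Task, deliver and poll are names chosen for this illustration and do not reproduce Mianjin's runtime interface.

```python
from collections import deque

class Task:
    def __init__(self):
        self.inbox = deque()   # calls delivered by the network, not yet run
        self.log = []

    def deliver(self, method, *args):
        """Network side: enqueue an incoming global method call."""
        self.inbox.append((method, args))

    def poll(self):
        """Run every call that is pending right now, then return.
        Never waits: with an empty inbox it returns 0 immediately."""
        handled = 0
        for _ in range(len(self.inbox)):
            method, args = self.inbox.popleft()
            method(self, *args)
            handled += 1
        return handled

def add(task, s):
    """Stand-in for an atomic global method body."""
    task.log.append(s)
```

Because poll neither blocks nor waits for a specific call, no extra thread is needed on the receiving side, which is the property the text refers to as keeping threading orthogonal to asynchronous computing.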

Requiring programmers to think in terms of abstractions that may and those 
that definitely will not communicate introduces a new burden. The same is true 
for the requirement to use continuations when trying to communicate from wit- 
hin a non-communicating abstraction, such as a message receiver. However, be- 
sides static safety gains, there is a significant gain as well: since abstractions are 
now statically classified, a programmer can rely on non-communicating abstractions 
not causing unexpected reentrancy as a result of communications hidden 
by an abstraction’s implementation. Thus, the programmer now knows when 
exactly it is necessary to reestablish invariants over task-global data structures 
before calling on some abstraction. Doing so is required exactly if the called ab- 
straction may communicate, i.e. poll, since system-level deadlocks can only be 
avoided if every attempt to send a message is paired with a promise to be able 
to receive a message. 

In combination, the resulting Mianjin model is much simpler and cleaner 
than strictly synchronous models, including CORBA^, Java RMI, and DCOM. 

5 A Semi-formal Description of Mianjin 

The previous sections introduced Mianjin, its distributed object system and the 
idea of preventing object reentrance through atomicity typing of commands. 
This section semi-formally describes the core concepts of Mianjin: distributed 
objects and atomicity typing via a cut-down version of the Mianjin language. 

Mianjin proper is essentially an extension of Oberon-2. Note that in Mianjin 
and Oberon object types are termed records; here we use the two terms inter- 
changeably. Mini-Mianjin as described here omits the following features from 
the full language: 

– Pointers (all records are heap-allocated and reference-based as in Java) 

– Constants, arrays, functions, procedures, modules, type aliases 

– VAR parameters, projection on subtyped values 

– Method overriding and super calls 

Only a limited set of commands and expressions is described; others follow 
quite naturally. In addition, some of the syntax has been changed to simplify 
formalisation, e.g., implicit “this” records are used for the receiver object in 
methods and methods are defined inside records (object types). The abstract 
syntax is described in Figure 1. It shows the syntax categories and identifiers 
used in rules to range over those categories, for example a variable t always 
represents a type. 

The following wellformedness criteria apply to programs (we informally state 
these here to avoid cluttering rules): 

– All record names in types must exist in the environment. 

– All records must directly or indirectly extend ANYREC. 

– Records may not add new methods or fields with the same names as ones 
defined in super types. 

– All record names in the environment must be distinct. 

^ CORBA supports “oneway” methods, but their semantics in the case of an asynchro- 
nous implementation is rather unclear. Commercial ORBs all implement “oneway” 
synchronously. 
                   Program    =  Record*
r                ∈ Record     =  RECORD Ident (Ident) Fields, Method* END
fs               ∈ Fields     =  (Ident : Type)*
m                ∈ Method     =  Attrib METHOD Ident ( Params )
                                 VAR Locals BEGIN Cmd* END
at               ∈ Attrib     =  Atomicity | GLOBAL
a                ∈ Atomicity  =  ATOMIC | POLL
                   Params     =  (Ident : Type)*
                   Locals     =  (Ident : Type)*
c                ∈ Cmd        =  NEW LExp | LExp := Exp |
                                 LExp . Ident ( Actuals ) | ...
                   Actuals    =  (Ident : Type)*
e                ∈ Exp        =  Exp = Exp | LExp | ...
le               ∈ LExp       =  Ident | LExp . Ident | THIS | LExp (Type)
t, u, v, pt, lt  ∈ Type       =  Ident | ANYREC | BasicType | GLOBAL Type
f, p, l, id, b   ∈ Ident

Fig. 1. Abstract syntax 



– All field and method names in each record must be distinct. 

– All parameters’ and locals’ names in each method must be distinct. 

In all contexts global annotations are idempotent and only affect records: 

GLOBAL (GLOBAL t) = GLOBAL t 
GLOBAL BasicType = BasicType 

We use two environments in our rules, one maps record names to records, R, 
the other maps variables to types, V : 

R : Ident → Record 
V : Ident → Type 

The following relations extract a record’s name from a record declaration 
(recname), assert that a type is a valid defined record (isrec), and assert that 
one record extends another (extends). 

recname(RECORD id (_) _, _ END) = id 
recname(ANYREC) = ANYREC 

isrec(R, GLOBAL t) = isrec(R, t) 
isrec(R, id) = isrec(R, R(id)) 
isrec(R, BasicType) = false 
isrec(R, RECORD id(b) fs, ms END) = (RECORD id(b) fs, ms END) ∈ range(R) 

extends(RECORD _(b) _, _ END) = RECORD b(_) _, _ END 
We now come to two key relations which define Mianjin. The atomicity relation 
(atomicity) governs the atomicity of method invocations given a method 
with a given attribute and object of a given type (local or global). The global 
relation represents the projection of the global type attribute across all para- 
meters of a global method (global). That is, the record formal parameters of a 
global method are all implicitly globalised. 

atomicity : (Attrib, Type) → Atomicity 
atomicity(GLOBAL, t) = ATOMIC,     if local(t) 
                     = POLL,       otherwise 
atomicity(ATOMIC, t) = ATOMIC,     if local(t) 
                     = undefined,  otherwise 
atomicity(POLL, t)   = POLL,       if local(t) 
                     = undefined,  otherwise 

global : (Attrib, Type) → Type 
global(GLOBAL, t) = GLOBAL t 
global(ATOMIC, t) = t 
global(POLL, t)   = t 

local(GLOBAL t) = false 
local(t)        = true, for all other t 
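
The three relations translate almost line by line into executable form. The Python sketch below assumes an encoding that is not part of the paper: record and basic types are plain strings, a global type is the pair ("GLOBAL", t), BASIC is an assumed set of basic type names, and None stands for "undefined".

```python
BASIC = {"INTEGER", "BOOL"}          # assumed set of basic types

def is_global(t):
    return isinstance(t, tuple) and t[0] == "GLOBAL"

def local(t):
    return not is_global(t)

def mk_global(t):
    """GLOBAL annotation: idempotent, and without effect on basic types."""
    if t in BASIC or is_global(t):
        return t
    return ("GLOBAL", t)

def atomicity(attrib, t):
    """Atomicity of invoking a method declared `attrib` on a receiver of
    type t; None where the relation is undefined."""
    if attrib == "GLOBAL":
        return "ATOMIC" if local(t) else "POLL"
    if attrib in ("ATOMIC", "POLL"):
        return attrib if local(t) else None
    raise ValueError(attrib)

def globalise(attrib, t):
    """The `global` relation: parameters of GLOBAL methods are globalised."""
    return mk_global(t) if attrib == "GLOBAL" else t
```

Under this encoding, a GLOBAL method invoked through a reference of type ("GLOBAL", "Acc") yields POLL, the same method invoked on a local "Acc" yields ATOMIC, and an ATOMIC or POLL method has no defined atomicity on a global receiver, which is how such calls are rejected.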



Given the previous abstract syntax and relations we can now define the type 
rules for our mini-Mianjin. For the various categories of syntax the type rules 
have the following forms: 



program and record declarations    R ⊢ Program     R ⊢ Record 
method declarations                R V ⊢ Method 
commands                           R V ⊢ Cmd : Atomicity 
expressions                        R V ⊢ Exp : Type 

The type rules for program, record and method declarations are shown in 
Figure 2. Note, we use underscore for the wildcard “don’t care” pattern, ellipses 
to denote sequences, and » to denote mapping extension. The rule for methods 
is of particular interest; it states that: 



– If the method is declared to be global or atomic, all its constituent commands 
must be atomic, i.e. only a method declared as poll may invoke polling 
commands. (In the case of a global method its server-side code must be atomic, 
although the actual method invocation on a global object will not be; 
see the rule for method invocation.) This simple analysis is conservative, cf. 
Java definite assignment. 

– All constituent commands must type check within an environment extended 
with the parameter and local variable bindings. The parameters are 
implicitly globalised if the method is labelled as being global. 
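Program 

\[
\frac{\begin{array}{c}
R = \{\mathrm{recname}(r_i) \mapsto r_i\}_{i=1 \ldots n} \gg \{\mathtt{ANYREC} \mapsto \mathtt{ANYREC}\} \\
R \vdash r_1, \;\ldots,\; R \vdash r_n
\end{array}}
{R \vdash r_1 \ldots r_n}
\]

Record 

\[
\frac{R\,\{\mathtt{THIS} \mapsto \mathit{id}\} \vdash m_1, \;\ldots,\; R\,\{\mathtt{THIS} \mapsto \mathit{id}\} \vdash m_n}
{R \vdash \mathtt{RECORD}\ \mathit{id}\,(\_)\ \_,\ m_1 \ldots m_n\ \mathtt{END}}
\]

Method 

\[
\frac{\begin{array}{c}
V' = V \gg \{p_i \mapsto \mathrm{global}(\mathit{at}, \mathit{pt}_i)\}_{i=1 \ldots m} \gg \{l_i \mapsto \mathit{lt}_i\}_{i=1 \ldots n} \\
R\,V' \vdash c_1 : a_1, \;\ldots,\; R\,V' \vdash c_o : a_o \\
\mathit{at} \in \{\mathtt{GLOBAL}, \mathtt{ATOMIC}\} \Rightarrow \forall i = 1 \ldots o : a_i = \mathtt{ATOMIC}
\end{array}}
{R\,V \vdash \mathit{at}\ \mathtt{METHOD}\ \_\ (p_1 : \mathit{pt}_1 \ldots p_m : \mathit{pt}_m)\ \mathtt{VAR}\ l_1 : \mathit{lt}_1 \ldots l_n : \mathit{lt}_n\ \mathtt{BEGIN}\ c_1 \ldots c_o\ \mathtt{END}}
\]

Fig. 2. Rules for program, method and record declarations 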



The type rules for commands are shown in Figure 3. The rule for method 
calls is of particular interest. It expresses the following: 

— The atomicity of the call depends on the attribute of the method being called 
and locality of the receiver object, as defined by the atomicity relation. 

— The method called must be declared in the receiver object’s type or super- 
type. 

— The method’s actual parameters must be subtypes of the formal parameters. 

— The formal parameter types are implicitly global if the method is global. 

— The receiver object and actual parameters must all be type correct. 



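\[
\frac{R\,V \vdash \mathit{le} : t \qquad \mathrm{isrec}(R, t) \;\&\; \mathrm{local}(t)}
{R\,V \vdash \mathtt{NEW}\ \mathit{le} : \mathtt{ATOMIC}}
\]

\[
\frac{R\,V \vdash \mathit{le} : t \qquad R\,V \vdash e : u \qquad R \vdash u \leq t}
{R\,V \vdash \mathit{le} := e : \mathtt{ATOMIC}}
\]

\[
\frac{\begin{array}{c}
R \vdash t_1 \leq \mathrm{global}(\mathit{at}, \mathit{pt}_1), \;\ldots,\; R \vdash t_n \leq \mathrm{global}(\mathit{at}, \mathit{pt}_n) \\
R\,V \vdash e_1 : t_1, \;\ldots,\; R\,V \vdash e_n : t_n \\
\mathit{at}\ \mathtt{METHOD}\ \mathit{id}\ (p_1 : \mathit{pt}_1 \ldots p_n : \mathit{pt}_n)\ \_\ \mathtt{END} \in \{m_1 \ldots m_o\} \\
R \vdash t \leq \mathtt{RECORD}\ \_\,(\_)\ \_,\ m_1 \ldots m_o\ \mathtt{END} \\
R\,V \vdash \mathit{le} : t \qquad a = \mathrm{atomicity}(\mathit{at}, t)
\end{array}}
{R\,V \vdash \mathit{le}.\mathit{id}\ (e_1 \ldots e_n) : a}
\]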



Fig. 3. Rules for commands 









The type rules for expressions (including left expressions) are shown in Fi- 
gure 4. Notice how only the fields of local objects may be accessed. Our simplified 
type cast is like Eiffel’s: if the cast fails, NIL is returned. It may be used to cast 
global records to local ones, in which case the cast succeeds if the record genuinely 
is local. 



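\[
\frac{R\,V \vdash e_1 : t \qquad R\,V \vdash e_2 : u \qquad R \vdash t \leq u \;\lor\; R \vdash u \leq t}
{R\,V \vdash e_1 = e_2 : \mathtt{BOOL}}
\]

\[
\frac{V(\mathit{id}) = t}
{R\,V \vdash \mathit{id} : t}
\]

\[
\frac{\begin{array}{c}
(f : t) \in \{f_1 : t_1 \ldots f_n : t_n\} \\
R \vdash u \leq \mathtt{RECORD}\ \_\,(\_)\ f_1 : t_1 \ldots f_n : t_n,\ \_\ \mathtt{END} \\
R\,V \vdash \mathit{le} : u \qquad \mathrm{local}(u)
\end{array}}
{R\,V \vdash \mathit{le}.f : t}
\]

\[
\frac{V(\mathtt{THIS}) = t}
{R\,V \vdash \mathtt{THIS} : t}
\]

\[
\frac{\mathrm{isrec}(R, u) \qquad \mathrm{isrec}(R, t) \qquad R\,V \vdash \mathit{le} : t}
{R\,V \vdash \mathit{le}(u) : u}
\]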



Fig. 4. Rules for expressions 



The relation ≤ denotes subtyping; it is defined in Figure 5. Subtyping applies 
to both explicitly defined subtyping of records and to local/global attributes of 
objects: local objects may be substituted for global ones but not vice versa. 
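
For record types, the subtyping relation can be decided by walking up the extension chain. The Python sketch below is an illustration under assumptions made here: the environment is reduced to a dictionary from each record name to the record it extends, a global type is encoded as ("GLOBAL", name), and basic types are not handled.

```python
def subtype(extends, t, u):
    """True if t is a subtype of u. `extends` maps each record name to its
    base record; ANYREC is the root. Types are record names or
    ("GLOBAL", name) pairs."""
    t_glob = isinstance(t, tuple)
    u_glob = isinstance(u, tuple)
    if t_glob and not u_glob:
        return False                  # a global never substitutes for a local
    t_name = t[1] if t_glob else t
    u_name = u[1] if u_glob else u
    # Walk up the extension chain from t looking for u; this covers the
    # reflexive and transitive cases and the rule for ANYREC.
    while True:
        if t_name == u_name:
            return True
        if t_name == "ANYREC":
            return False
        t_name = extends[t_name]

extends = {"Acc": "ANYREC", "BigAcc": "Acc"}   # BigAcc is a made-up record
```

With this environment, subtype(extends, "Acc", ("GLOBAL", "Acc")) holds, so a local accumulator may be passed where a global one is expected, whereas the converse requires the checked cast.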

6 Further Applications 

The notion of atomicity typing can be generalised to reentrancy classes. The 
idea is simple: allow for any number of command types subject to a partial 
ordering (subtype relation). The default command type is at ground-level, i.e., 
cannot invoke code with any non-default reentrancy type — this corresponds to 
the atomic type in atomicity typing. 

The observer pattern permits clients to attach to an object and to be notified 
when the object changes state. Often a synchronous notification is required i.e. 
all objects are informed of each state change, before subsequent state changes 
are permitted. A simple example is shown below, in pseudo code: 

ABSTRACT RECORD Observer 
  PROCEDURE Notify 
END 
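\[
\frac{\mathrm{isrec}(R, t)}
{R \vdash t \leq \mathtt{ANYREC}}
\qquad
\frac{\mathrm{extends}(t) = u \qquad \mathrm{isrec}(R, t) \qquad \mathrm{isrec}(R, u)}
{R \vdash t \leq u}
\]

\[
\frac{}{R \vdash t \leq t}
\qquad
\frac{R \vdash t \leq u \qquad R \vdash u \leq v}
{R \vdash t \leq v}
\]

\[
\frac{R \vdash t \leq u}
{R \vdash t \leq \mathtt{GLOBAL}\ u}
\qquad
\frac{R \vdash t \leq u}
{R \vdash \mathtt{GLOBAL}\ t \leq \mathtt{GLOBAL}\ u}
\]

Fig. 5. Rules for subtyping 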



RECORD Value 
  val: INTEGER 
  obs: ARRAY OF Observer 
  numobs: INTEGER 

  PROCEDURE Get(): INTEGER 
  BEGIN 
    RETURN THIS.val 
  END Get 

  PROCEDURE Set(newval: INTEGER) 
  VAR i: INTEGER 
  BEGIN 
    THIS.val := newval 
    FOR i := 0 TO THIS.numobs-1 DO THIS.obs[i].Notify END 
  END Set 
END 



Problems can arise if an implementation of Notify calls Set since observers 
may no longer be synchronously informed of each state change. The key to 
solving the problem is to prevent Notify or any routine in its implementation 
from calling Set. It may however call Get. This is similar to Mianjin’s insistence 
that ATOMIC code may not call POLL. We can envisage a system which supports 
the control of the contexts in which method calls may be made. Consider the 
following example: 
ABSTRACT RECORD Observer 
  PROCEDURE Notify, RESTRICTION(Value.Set) 
END 



Similar to the GLOBAL attribute in Mianjin, the RESTRICTION attribute states 
that any implementation of Notify must not, directly or indirectly, call 
Value.Set. There are a number of issues with this approach: 



– Restrictions (intentionally) interfere with procedural composition. It is thus 
important to pick a default that allows the coding of restriction-agnostic 
procedures/methods. 

– The restriction must be declared when the method is declared, and cannot be 
added afterwards. In the above example it would therefore be better (more 
general) to restrict Notify from calling a more general mutation interface 
and then to declare that Value.Set implements this mutation interface. 

– Restrictions (intentionally) go against abstraction. Retaining the necessary 
restriction conditions across abstraction boundaries is the whole purpose of 
such annotations. Refinements, such as discussed in the previous point, can 
help in reducing the unnecessary impact on abstraction. 

– Like most simple type-system approaches, restriction annotations affect all 
objects of a given type, rather than specific instances involved in some 
scenario. For example, it might be perfectly acceptable for a notifier to modify 
some other Value instance. For a finer-grained level of control, dependent 
types or alias-controlling type systems could be used [3]. 
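
The property that a RESTRICTION annotation is meant to guarantee, that Value.Set is not reachable from any implementation of Notify, can be stated as a reachability question on a call graph. The Python sketch below checks it on a whole program given as a dictionary; this is a simplification made for illustration, since a type system would establish the same property modularly from annotations. The routine names in the example graph are invented.

```python
def violates(calls, start, forbidden):
    """True if `forbidden` is reachable from `start` in the call graph
    `calls` (a dict: routine name -> set of routines it calls directly)."""
    seen, stack = set(), [start]
    while stack:
        routine = stack.pop()
        for callee in calls.get(routine, ()):
            if callee == forbidden:
                return True
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return False

calls = {
    "Logger.Notify": {"Value.Get"},      # only reads the value
    "Resetter.Notify": {"helper"},       # reaches Set through a helper
    "helper": {"Value.Set"},
}
```

Here Logger.Notify satisfies the restriction, while Resetter.Notify violates it indirectly; the indirect case is the one that is hard to spot without tool support.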

A simple restriction system, as introduced above, goes a long way in ad- 
dressing some of the pressing reentrancy problems of object-oriented programs. 
However, much work remains to establish the classes of reentrancy that can 
be covered with such systems and to formulate practical type systems that, at 
reasonable levels of annotational overhead, deliver static reentrancy control in 
practice. 



7 Conclusions 

We have described how the Mianjin parallel language uses a type system to 
enforce global system invariants, namely to prevent reentrance of remote objects, 
which could have unbounded buffer requirements in an asynchronous setting. 
Our experience with an implementation of Mianjin has been that such typing 
is extremely useful. It enables the development of software components which 
have an important aspect of their global system behaviour documented and 
checked as part of their type. We have also alluded to how the type system may 
be generalised to solve other problems of enforcing global system invariants by 
statically restricting certain forms of reentrancy. 
Abstract. This paper presents ADIOS, a system for the development of 
interactively controlled metacomputations. In ADIOS, coarse grain distributed 
computations, involving several different program components, are structured 
according to a distributed Model-View-Controller architecture. Design support 
is offered via interfaces and a base class, through which the fundamental 
behavior of program components is established within a concrete framework. In 
addition, appropriate communication mechanisms are provided to back up the 
data exchange between the various components. Consisting of only a few Java 
classes, ADIOS is lightweight and promotes disciplined prototyping of 
metacomputations while allowing for the integration of legacy code. 



1 Introduction 

Thanks to advances in parallel programming platforms and mathematical libraries, 
scientists are more than ever able to seamlessly exploit special-purpose hardware, 
multiprocessors and clusters of workstations to speed up individual computations. 
Now, as scientific work gradually focuses on more complex phenomena, involving 
the interplay of processes that have been studied in isolation, a new challenge arises: 
to connect several different scientific modules in ‘computational grids’, spanning 
several regions or even countries. This coarse-grained distributed computing is also 
known as metacomputing. For example, the description of an ambitious project on the 
integration of combustion and airflow programs to build an advanced simulation of a 
jet engine can be found in [12]. 

Although metacomputations may run as ‘silent’ background tasks, interactive 
visualization and control can play an important role. One reason is debugging, i.e. to 
facilitate the detection of programming flaws, data exchange incompatibilities and 
performance bottlenecks. It may also be desirable to display the results of a 
computation at runtime, adjust parameters and restart the computation with different 
initial conditions to experiment with various “what if” scenarios. Taking a more 
futuristic perspective, users could immerse themselves in artificial spaces as these are 
being calculated via powerful computational grids. Direct on-line collaboration 
between multiple users wishing to jointly control a metacomputation from distant 
locations could also be supported. Notably, the technology for supporting interaction 
in virtual reality environments has made remarkable progress in the past years [19]. 

From a software engineering perspective, interactive metacomputing yields a 
transition from monolithic, stand-alone, command-line driven, batch-oriented code 
towards a network of remotely controllable programs. Each program is a component, 
provided by a different team of scientists, which, while performing calculations, 
can communicate and coordinate its execution with other components of the network. 
Metacomputations can also include additional components, providing control and 
visualization facilities. It is therefore important to identify the interactions among the 
various parts of a distributed computation and capture them in a pattern in the spirit of 
[10]. Providing conceptual guidance as to how this is done can be valuable for multi- 
disciplinary computing where, in practice, only a few persons involved in the 
development are likely to be computer experts. 

In this paper, we present ADIOS - A Distributed Interactive Object System - 
intended to support the development of interactive metacomputations. In ADIOS, a 
computation consists of several different components integrated within a distributed 
Model-View-Controller architecture. The rudimentary interactions between these 
components are supported by appropriate communication mechanisms. Application 
programmers are thus able to design and implement metacomputations in a 
straightforward way. The system is implemented in Java and can easily be adapted to 
work with popular component models such as Microsoft’s DCOM, OMG’s CORBA 
or JavaSoft’s JavaBeans. 

The rest of the paper is structured as follows. Section 2 gives an overview of the 
ADIOS architecture. In section 3, the process of application development in ADIOS 
is sketched. Section 4 compares with related work. Finally, section 5 concludes the 
paper and sets future directions. 



2 Architecture of ADIOS 

The ADIOS architecture, depicted in Figure 1, is based on the Model-View-Controller 
paradigm [17]. In a nutshell, a computation is divided into three main component 
types: code that performs a calculation and manipulates local data structures (Model), 
code that forwards user commands to a model to control its execution (Controller), 
and code responsible for receiving and visualizing the state of a model (View). In 
ADIOS, view and controller are combined into a single entity called Observer. Each 
computation also includes a Directory used to register the addresses of components. 
Since metacomputations are de facto distributed, these components are located on 
different machines and communicate over the network. The role and the interactions 
between these components are described in the next subsections. 



2.1 Components 

Directory. The directory is a catalog holding the addresses (locations) of the 
computation’s components. Additional properties of components, e.g. functional 
description, access control information, etc, could also be kept with the directory. 
Each computation maintains a separate directory, hence has its own, private naming 
scope. The directory must be initialized prior to other components of a computation. 

Model. Models embody programs performing calculations, thus are intrinsic parts of 
the computation. A metacomputation typically comprises several models that run 
concurrently with each other and exchange data to coordinate their execution. Upon 
installation, models register with the directory to make their presence known to the 
system. Models are likely to be implemented by different persons residing in different 
organizations and institutions. A model may also act as a wrapper for legacy code. 

Observer. A component that is external to the computation and communicates with 
models is called an observer. Observers typically implement user interfaces for 
controlling model execution; they can also be software agents performing monitoring 
tasks or recording the execution of the computation for later inspection. An observer 
may communicate with several models. Different types of observers can dynamically 
connect to and disconnect from models, without requiring them to be re-programmed, 
re-compiled, or re-started. In fact, a model executes without knowledge of the specific 
observers linked to it. Observers locate models via the directory. 




Fig. 1. ADIOS computations are structured according to a distributed MVC architecture and 
may include several models that communicate with each other. Component registration and 
lookup occur via the directory. An arbitrary number of observers can monitor execution. 

The interactions between these components are more formally captured as a set of 
protocols¹, supported via concrete interfaces and communication mechanisms. This 
provides developers with an accurate, and thus easy to use, model. The ADIOS 
protocols are described informally below. 



¹ Specifications of the interaction between components, rather than just a programming 
interface; this closely corresponds to the notion of a contract used in the literature. 



2.2 Protocols 

Directory Protocol. Defines how component address (and other) information is 
registered and retrieved using the directory. This request-response communication is 
realized via a Java RMI interface featuring the necessary registration and lookup 
routines. ADIOS provides a default implementation, which can easily be replaced 
by any third-party directory service. 
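
To make the request-response style of the directory protocol more concrete, the following is a minimal sketch of what a registration and lookup interface could look like. The names DirectoryDefs, register, lookup, unregister and InMemoryDirectory are hypothetical and are not taken from ADIOS; the in-memory class merely stands in for a service that would, in practice, be exported as an RMI remote object.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical directory interface; the real ADIOS method names are not given in the text.
interface DirectoryDefs extends Remote {
    void register(String name, String address) throws RemoteException;
    String lookup(String name) throws RemoteException;   // null if the name is unknown
    void unregister(String name) throws RemoteException;
}

// Local stand-in for a directory service; it is not exported over RMI here.
class InMemoryDirectory implements DirectoryDefs {
    private final Map<String, String> entries = new ConcurrentHashMap<>();

    public void register(String name, String address) {
        entries.put(name, address);           // a later registration replaces an earlier one
    }

    public String lookup(String name) {
        return entries.get(name);
    }

    public void unregister(String name) {
        entries.remove(name);
    }
}

class DirectoryDemo {
    public static void main(String[] args) throws RemoteException {
        DirectoryDefs dir = new InMemoryDirectory();
        dir.register("planet1", "hostA:4001");     // a model announces itself
        System.out.println(dir.lookup("planet1")); // an observer locates the model
    }
}
```

Because clients depend only on the interface, a third-party directory service could be substituted by providing another implementation of the same interface.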

Computation Protocol. Describes the data exchange between models for the purpose 
of coordinating the execution of the computation. Inter-model communication is 
supported via type-safe, asynchronous message passing. Messages were preferred 
over an RPC mechanism, such as Java RMI, because models typically interact with 
each other in an asynchronous peer-to-peer fashion rather than in a strict 
client-server fashion. The underlying implementation is a version of the Hermes 
communication mechanism [20] adapted to the Java programming environment. To achieve 
logically consistent communication between multiple models, causal domains can be 
defined according to which messages are ordered. Preserving causality in a group of 
cooperating processes can be important in a distributed system [21], and popular 
toolkits, e.g. the ISIS system [5], provide corresponding communication support. 
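
The idea of causally ordered delivery can be illustrated with the well-known vector-timestamp test. The sketch below is not the Hermes-based mechanism of ADIOS, whose internals are not discussed here; it only shows, under the assumption that every message carries the sender's vector timestamp, how a receiver could decide whether a message may be delivered or must be held back. The class and method names are hypothetical.

```java
// Illustrative only: a textbook vector-timestamp delivery test, not ADIOS code.
class CausalDelivery {
    // delivered[k] counts the messages of process k already delivered at this receiver.
    // stamp is the vector timestamp the sender attached to the incoming message.
    static boolean deliverable(int[] delivered, int sender, int[] stamp) {
        for (int k = 0; k < stamp.length; k++) {
            if (k == sender) {
                // must be exactly the next message of the sender
                if (stamp[k] != delivered[k] + 1) return false;
            } else if (stamp[k] > delivered[k]) {
                // the sender had seen a message of k that has not arrived here yet
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] delivered = {0, 0, 0};
        int[] fromP0 = {1, 0, 0};   // first message of process 0
        int[] fromP1 = {1, 1, 0};   // sent by process 1 after it had received fromP0
        System.out.println(deliverable(delivered, 1, fromP1)); // false: fromP0 is missing
        System.out.println(deliverable(delivered, 0, fromP0)); // true
        delivered[0] = 1;                                      // fromP0 is now delivered
        System.out.println(deliverable(delivered, 1, fromP1)); // true
    }
}
```

A message that fails the test would be buffered and re-examined after each delivery, so that every receiver in the same causal domain observes cause before effect.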

Control Protocol. Describes the means for initializing or modifying parameters of a 
model, and controlling its execution. A Java RMI interface includes the basic control 
methods of a model component. ADIOS neither introduces multi-model control 
operations nor assumes any semantics for them. The provided primitives can be 
used to implement such functionality, if desired. Also, if different users are to 
control several models simultaneously, their actions must be coordinated via 
distributed schemes, such as token passing, locking, or shared message boards. 

Notification Protocol. Specifies how models notify observers. A 
subscription-notification mechanism, similar to the Observer pattern [10], is used for 
this purpose. In other words, observers subscribe to and unsubscribe from a model to 
start and, respectively, stop receiving its notifications. The number or type of the 
observers of a model may change dynamically and does not affect model execution. 
Notification events are broadcast to observers via the ADIOS message passing 
mechanism. This decouples model and observer execution, thereby eliminating 
potential performance bottlenecks and avoiding re-entrance problems that can be 
difficult to detect and handle in a distributed system. However, it allows only for 
asynchronous monitoring of models, since their state may change before 
notifications corresponding to older states reach an observer. This limitation is 
compensated by the fact that observers can receive notification events in causal order, 
thereby witnessing a logically consistent history of the entire computation. The latter 
is considered sufficient for typical observer tasks. 
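
The decoupling of model and observer execution can be sketched as follows. This is a simplified, single-process illustration under our own assumptions: the class NotificationHub and its methods are hypothetical, events are plain strings, and per-observer queues take the place of the ADIOS message passing mechanism. The point is only that the model enqueues an event and continues, without ever calling into observer code.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical notification hub: one mailbox per subscribed observer.
class NotificationHub {
    private final Map<String, BlockingQueue<String>> mailboxes = new ConcurrentHashMap<>();

    // Joins the notification group and returns the observer's mailbox.
    BlockingQueue<String> subscribe(String observer) {
        return mailboxes.computeIfAbsent(observer, k -> new LinkedBlockingQueue<>());
    }

    // Leaves the group; events already queued stay in the old mailbox.
    void unsubscribe(String observer) {
        mailboxes.remove(observer);
    }

    // Called by the model: enqueue and return at once, never wait for observers.
    void broadcast(String event) {
        for (BlockingQueue<String> box : mailboxes.values()) {
            box.offer(event);   // the queue is unbounded, so offer does not block
        }
    }
}

class NotificationDemo {
    public static void main(String[] args) throws InterruptedException {
        NotificationHub hub = new NotificationHub();
        BlockingQueue<String> viewer = hub.subscribe("viewer");
        hub.broadcast("state-1");
        hub.broadcast("state-2");            // the model may already be ahead of the viewer
        System.out.println(viewer.take());   // state-1
        System.out.println(viewer.take());   // state-2
    }
}
```

The second broadcast in the usage example shows why monitoring is asynchronous: by the time the observer reads state-1, the model has already moved on.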



² For brevity, the message passing mechanism of ADIOS is not further discussed in this paper. 



3 Programming in ADIOS 

The elementary behavior of model components is captured via a base class 
implementing the interfaces through which the main control, notification and 
computation actions occur. Programmers can inherit from this class and extend 
its interfaces to introduce application-specific functionality. The next sections 
illustrate this process in more detail. 



3.1 The Model Class 

The Model class implements the basic functionality of models. It provides two 
interfaces that define primitives for the interaction with observers and other models 
respectively, corresponding to the aforementioned protocols: 

interface ObserverDefs extends Remote { 
    // control primitives 
    void start (Object init_pars) ... ; 
    void setBreakpoint (String label, long ticks) ... ; 

    // notification primitives 
    void subscribe (String name, String[] msg_types) ... ; 
    void unsubscribe (String name, String[] msg_types) ... ; 
    class ThreadStateMsg extends Msg { ... } 
} 

interface ComputationDefs extends Remote { 
    // empty 
} 

The ObserverDefs interface defines two control primitives. Method start initializes 
the model and begins its execution. Initialization data can be passed to the model as a 
serializable Java object; the actual object type is application dependent. Method 
setBreakpoint is used to suspend and resume execution. If its ticks argument n is 
positive, the model is suspended the n-th time it crosses the corresponding breakpoint 
statement embedded in its code. Otherwise, if n is zero or negative, model execution is 
resumed, provided the model is currently suspended at that breakpoint. 
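
The counting rule of setBreakpoint can be sketched in isolation. The following bookkeeping class is hypothetical and deliberately omits the actual suspension and resumption of the model thread; it only decides, at each crossing of a labelled breakpoint, whether the model would have to suspend. That a breakpoint is disarmed once it has triggered is our assumption, not something stated above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical bookkeeping for the counting rule only; thread suspension is omitted.
class BreakpointTable {
    private final Map<String, Long> remaining = new HashMap<>();

    // Mirrors setBreakpoint(label, ticks): positive arms, zero or negative disarms.
    synchronized void set(String label, long ticks) {
        if (ticks > 0) {
            remaining.put(label, ticks);
        } else {
            remaining.remove(label);
        }
    }

    // Called each time the model crosses breakpoint(label); true means "suspend now".
    synchronized boolean cross(String label) {
        Long left = remaining.get(label);
        if (left == null) return false;       // not armed: keep running
        if (left > 1) {
            remaining.put(label, left - 1);   // not yet the n-th crossing
            return false;
        }
        remaining.remove(label);              // assumption: disarm after triggering
        return true;
    }
}

class BreakpointDemo {
    public static void main(String[] args) {
        BreakpointTable table = new BreakpointTable();
        table.set("break", 3);
        for (int i = 1; i <= 4; i++) {
            System.out.println("crossing " + i + " suspends: " + table.cross("break"));
        }
    }
}
```

In the usage example only the third crossing reports that the model should suspend; a full implementation would block the model thread at that point until a call with a zero or negative value releases it.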

Methods subscribe and unsubscribe are used by observers to join and, respectively, 
leave a model’s notification group. An observer must specify its name as well as the 
types of notification messages it wishes to receive. The base class defines a single 
notification message, ThreadStateMsg, carrying the state of the model’s main 
execution thread; such a message is sent to observers for each thread state change. 
Additional messages may be defined to carry application-specific state information. 

The ComputationDefs interface is empty. This is because the actual messages 
exchanged between the models of a computation are a priori unknown. It is the 
application programmers who must define these messages when they design each 
model interface as part of an entire computation. The empty interface nevertheless 
exists to underline that the communication between models should be defined via a 
separate interface, not as a part of the ObserverDefs interface. 

The Model class also exports a few methods to be used by programmers to 
implement application-specific behavior of model subclasses. The most important 
primitives are shown below: 

abstract class Model implements ... { 
    abstract void run(); 
    protected void breakpoint (String label); 
    protected void notifyObservers () ... ; 
    protected MsgCenter getMsgCenter () ... ; 
} 

Method run is an abstract method embodying the main execution thread of the 
model, and is invoked only once, when the model is started. This method must be 
implemented to perform the model’s calculation. Method breakpoint is to be placed 
within run in locations where it is desirable to interrupt model execution via the 
corresponding control interface call. Several breakpoints with different labels can be 
placed within the main execution thread of a model. 

Method notifyObservers is an up-call, invoked from within the base class code 
whenever the state of the main execution thread changes. The default implementation 
is to send a ThreadStateMsg notification to the observers of the model. This method 
can be overridden to send additional (state) information, if desired. Subclasses can 
send messages to a model’s observer group via the messaging interface, which is 
obtained with a call to getMsgCenter. 

3.2 Application Development - A Simple Example 

Application development in ADIOS is a well-defined stepwise process. First, the 
application-specific interactions between the various components of the computation 
must be specified. Novel control, notification, and inter-model coordination aspects 
are introduced by augmenting the existing interfaces with new methods and messages, 
refining the corresponding protocols. This design task can be completed without 
lingering on the internals of each component; legacy code considerations could, 
however, pose some restrictions as to what is practically feasible. Then, each 
component can be implemented, as a subclass of the Model class, and tested in 
isolation by different teams. In a final phase, the components are connected to each 
other to form a distributed computation. 

The development of user interface and visualization tools for a given 
metacomputation can commence as soon as the corresponding model definitions are 
complete. Due to the design of the system, an interface can be implemented even after 
the various models are developed. Also, several different analysis tools can be 
connected to a metacomputation at the same time without interfering with each other. 

As an example, suppose we were to develop a simulation of a system with two 
‘planets’ circling around a stationary ‘sun’, as depicted in Figure 2. This computation 
involves two models, calculating planet movement and exchanging positioning data 
with each other. For simplicity, we assume a common coordinate and time reference 
system between models. Also, suppose that users should be able to start the 
computation and continuously monitor the planets’ trajectories. It should also be 
possible for the users to halt the simulation, modify the velocity of either planet, and 
continue with its execution. 

Fig. 2. Sketch of a metacomputation consisting of two loosely coupled models simulating the 
behavior of two planets in a closed-world gravitational system. 

Designing this application in the ADIOS system translates into giving three 
definitions: a control method for setting a planet’s velocity; a notification message 
that carries state information relevant to the observers of the simulation; a message 
with planet position data to be exchanged between models. These definitions can be 
expressed as extensions of the basic interfaces as follows: 



interface PlanetObserverDefs extends ObserverDefs { 
    // new control primitives 
    void setVelocity (float vel_x, float vel_y, float vel_z); 

    // new notification messages 
    class PlanetStateMsg extends Msg { ... } 
} 

interface PlanetComputationDefs extends ComputationDefs { 
    // new data exchange messages 
    class PlanetPosMsg extends Msg { ... } 
} 

Each model must perform the actual calculation, complying with the agreed 
specification. A sketch of a possible implementation is given below: 

class PlanetModel extends Model ... { 
    void run () { 
        MsgCenter mc = getMsgCenter(); 
        while (true) { 
            // publish the current state and exchange positions with the other planet 
            mc.sendObservers(new PlanetStateMsg(...)); 
            mc.send(other_planet, new PlanetPosMsg(...)); 
            PlanetPosMsg m = (PlanetPosMsg) mc.receive(other_planet); 
            // here comes the code for calculating the new position 
            breakpoint("break"); 
        } 
    } 
} 

It is important to note that the calculation of the planet’s movement (appearing as a 
comment in the code) does not need to be implemented by the PlanetModel class or 
even in Java. It could be provided in any programming language, such as C, C++, 
or FORTRAN, that can be invoked via the Java Native Interface [23]. Legacy code 
can be integrated in a computation according to the same principle, in which case the 
model becomes merely a wrapper. 



4 Related Work 

Much work has been done in metacomputing and several infrastructures have been 
developed, e.g. Legion [15], Globus [9], and WebFlow/WebVM [4], to name a few. 

A common objective of these systems is to allow collections of heterogeneous 
computing facilities to be used as a single, seamless virtual machine. As a 
consequence a lot of effort is invested to support resource discovery, binding, 
efficient resource allocation, object persistence, migration, fault tolerance, security, 
and language independence. ADIOS is a comparatively lightweight system, intended for 
scientists who wish to swiftly interconnect local programs into a computational grid, 
using Java as a common ‘software bus’. Since ADIOS does not make any restrictive 
assumptions about object installation and initialization, it could be augmented with 
interactive and automatic object placement mechanisms. In fact, ADIOS could be 
introduced as a structuring/communication layer on top of such all-encompassing 
infrastructures. For example, the so-called ‘stationary agents’ introduced in [24] as 
legacy program wrappers are essentially equivalent to ADIOS’ model components. 
ADIOS could be ported to such a mobile agent platform with only a slight adaptation 
of its communication mechanisms. 

While ADIOS lacks extensive programming and runtime support, it advocates a 
development methodology adhering to a concrete pattern. The aforementioned 
systems, which aim to serve a wide range of languages and applications, leave many 
design decisions to the programmer. Further, ADIOS structures the interaction 
between the components of a computation via Java RMI and typed messages. 
Developers are thus given guidance as to how to capture the interactions between the 
various parts of an application, directly in terms of the Java programming language. 
This is in contrast to other environments such as WebFlow [4] and JavaPorts [11] 
where the programmer must establish and communicate using TCP/IP channels. 
Similarly, the Nexus communication library [7] offers an array of message passing 
interfaces including MPI, which are quite ‘low level’ requiring language-specific 
adapters and preprocessors. To achieve interoperability at the message level, even 
pure Java MPI implementations, e.g. [2], must deal with buffers (i.e. arrays of bytes) 
rather than Java objects. Notably, ADIOS does not go so far as to provide full 
distribution transparency like other systems, e.g. Java// [8] and Do! [22]. Therefore the 
parts of a computation residing on different machines must be identified at the design 
stage. This is a minor drawback for coarse-grained distributed computations where the 
various components are usually known in advance. 

ADIOS enables programmers to instrument model code for the purpose of 
controlling its execution. Indeed, by combining breakpoints with notification 
messages it is possible to monitor a distributed computation in interactive step-by-step 
mode. Similar support is offered by debuggers and monitoring systems, e.g. DARP 
[1], which usually also come with advanced code instrumentation tools (ADIOS does 
not provide such tools). Further, the ADIOS system has similarities with groupware 
and collaborative simulation systems, such as CEV [6], Clock [14], and TANGOsim [3]. 
Their design also heavily relies on the MVC architecture but focuses on centralized 
applications where state is maintained within a single entity. ADIOS can handle 
several asynchronously executing entities (models) and achieves fully distributed yet 
logically consistent monitoring of such a system. Only CAVERN [18], an architecture 
with focus on collaboration in large virtual spaces, supports an extensive range of 
distributed server topologies; the pattern proposed in this paper can be viewed as a 
combination of the “Replicated Homogeneous” and the “Client-server Subgrouping” 
scheme of CAVERN. Finally, ADIOS does not come with concrete implementations 
of monitoring and visualization software. Nevertheless, due to its design, any 
monitoring/visualization kit could be connected to an ADIOS metacomputation as an 
observer. For the time being, observers must be Java programs. 



5 Conclusions and Future Work 

Focusing on the main structuring and communication aspects of interactive 
distributed computations, ADIOS provides developers with both a design pattern and 
a programming framework. We believe that this dual approach simplifies the 
implementation of metacomputations considerably. Consisting merely of a few Java 
classes, ADIOS is a low-cost, portable system. No installation or maintenance of ‘yet 
another’ special-purpose runtime environment is required. Thus, ADIOS can serve as 
a rapid prototyping environment. 

As mentioned above, there are quite a few extensions that could be introduced in 
ADIOS. Recently matured Java technology, e.g. Jini/JavaSpaces and Enterprise 
JavaBeans, could also be exploited to better integrate ADIOS in the Java development 
environment. Moreover, application development in ADIOS has many parallels with 
Aspect Oriented Programming [16]. The computation protocol, but more importantly 
the control and notification protocols, could be viewed as aspects of a particular 
application, which must be interwoven with existing component code. Thus an 
interesting endeavor would be to port ADIOS to an aspect-enabled development 
environment, and combine it with automated wrapper generation tools, e.g. [13]. At 
least in theory, it would then be possible to automatically produce metacomputation 
code simply by selecting the appropriate legacy programs and ‘wiring’ them together 
via a GUI editor. Scripting languages based on XML could also be used to describe 
the ‘wiring diagrams’ of metacomputations in text. 

While considering these possibilities, we wish to employ the system in the 
development of complex applications in order to verify its advantages but also 
discover its limitations. For this reason, in collaboration with other scientists, we are 
now considering using ADIOS in a distributed gas turbine engine simulation 
environment, described in [12]. We are confident that through this work we will be 
able to enhance our approach, making the system more attractive to scientists. 



Abstract. This paper introduces an experimental framework for mobile agents. 
It uses the expressiveness and formal foundation of concurrent constraint 
programming to provide system support for dynamic re-binding of 
non-transferable resources and for inter-agent collaboration based on logic 
variables. The proposed solutions make agent-based programming easier and 
more straightforward and, at the same time, offer a basis for more 
sophisticated multi-agent systems. The framework is implemented in the 
Distributed Oz programming language, which appears to the programmer as a 
concurrent, object-oriented language with data-flow synchronization. There are 
two implementations of the system: one based on freely mobile objects and 
one based on components (functors). 



1 Introduction 

The dynamically changing, networked environment with distributed information 
and computation resources is a permanent challenge for new techniques, languages 
and paradigms that support application design for such an environment in a 
straightforward way. Perhaps the most promising among the new paradigms is the 
mobile agent (MA) paradigm. The basic properties of mobile agents are autonomy 
and mobility [18]. MAs are autonomous because of their capability to decide which 
locations in a computer network they visit and what actions they take once 
there. This behaviour is given either in the source code of the MA (implicit 
behaviour) or by the agent’s itinerary, set dynamically (explicit order). Mobile 
agents can move between locations in a network. A location is the basic environment 
for the execution of mobile agents and therefore an abstraction of the underlying 
computer network and operating system. MAs are agents because they are autonomous 
and they should be able to cooperate. 

Most of today’s MA systems offer inter-agent communication. Only a few of 
them (e.g. Concordia [22], Gypsy [10], Mole [1]), however, also offer means for agent 
coordination. Other interesting questions that receive little attention in 
current frameworks for designing and running agent-based systems are [11]: 

• application-level issues such as the ease of agent programming, 

• control and management of agents, and 

• dynamic discovery of resources. 

In this paper, we present an experimental framework that proposes a mechanism-level 
solution for some of the problems identified above, in particular simplifying 
agent-based programming (through system support for re-binding of non-transferable 
resources) and supporting agent coordination and collaboration (using logic 
variables). To achieve these goals, we use the expressiveness and formal foundation 
of concurrent constraint programming offered by the Distributed Oz programming language. 

The paper is structured as follows: In Chapter 2 we give an overview of Distributed 
Oz and Chapter 3 identifies its key properties for mobile computing. In Chapter 4 we 
introduce our experimental framework, while separately characterizing a mobile agent 
environment, a mobile agent abstraction and a mobile agent based application. 
Chapter 5 compares our proposed system to other state of the art MA systems and 
Chapter 6 concludes the paper. 



2 Distributed Oz - Overview 

Distributed Oz [8] is a distributed programming language proposed and implemented 
at DFKI (German Research Center for Artificial Intelligence), SICS (Swedish 
Institute of Computer Science) and Université catholique de Louvain. The current 
implementation of Oz is called Mozart and is publicly available at [4]. The foundation 
for Distributed Oz is the language Oz 1, which was designed by Gert Smolka et al. at 
the Programming Systems Lab, DFKI [21] for fine-grained concurrency and implicit 
exploitation of parallelism. 

The main characteristics of Oz are concurrency, first-class procedures with lexical 
scoping enabling powerful higher-order programming, logic variables, dynamic 
typing and constraint programming. The full Oz language is defined by transforming 
all its statements into a small kernel language - see Fig. 1. 



S ::= S S                                                                 Sequence 
    | X = f(l1:Y1 ... ln:Yn) | X = <number> | X = <atom> | {NewName X}    Value 
    | local X1 ... Xn in S end | X = Y                                    Variable 
    | proc {X Y1 ... Yn} S end | {X Y1 ... Yn}                            Procedure 
    | {NewCell Y X} | {Exchange X Y Z} | {Access X Y}                     State 
    | if X then S else S end                                              Conditional 
    | thread S end | {GetThreadId X}                                      Thread 
    | try S catch X then S end | raise X end                              Exception 

Fig. 1. The Oz kernel language 

The execution model of Oz consists of a shared store and a number of threads 
connected to it. Threads reduce their expressions concurrently. The store is partitioned 
into compartments for constraints, procedures and cells. As computation proceeds, the 
store accumulates constraints on variables in the constraint store [8]. 

Distributed Oz (DOz) preserves the language semantics of Oz and defines the 
distributed semantics for all Oz entities. Oz distinguishes: 

• stateless entities - records, numbers, procedures and classes, 

• single assignment entities - logic variables, futures (read-only capability of a logic 
variable), streams, 

• stateful entities - cells (updateable mutual binding of a name and a variable), 
objects, reentrant locks, ports (asynchronous channel for many-to-one 
communication) and threads, 

• resource - entities external to the shared store (their references can be passed at 
will, but the resource can be executed only on its home site). 

The execution model of DOz differs from the execution model of Oz only in its 
partitioning into multiple sites. A site represents a part of a system, whereby each 
thread, variable, cell and port belongs to exactly one site. Each site is modeled as a 
number of threads connected to a store, but this store is accessible to any thread 
(which can be on a different site). Thus, DOz creates a so-called distributed store - 
the basic communication mechanism. An entity becomes accessible to other threads 
when it is exported. Exporting means sending out a message with an embedded 
reference to the entity. Symmetrically, a thread imports an entity by receiving a 
message containing a reference. The distributed semantics defines, for each type of 
entity, exactly what happens when it is exported and imported. Briefly speaking, 
stateless entities are replicated and stateful entities can be either stationary or mobile. 
For example, a port is stationary, so only threads on its home site can put a message 
on the stream associated with the port. Other threads can only request this 
operation, which is then executed at the home site of the port. The port is identified by 
automatically generated references, which are created when the port is exported. In 
contrast, mobile entities, such as objects, can move on remote invocation, so the 
operation is executed locally by the initiator of the operation. Unlike 
simulated object mobility (by cloning and redirection), e.g. in Obliq [2], a DOz 
object no longer consumes resources at its former site after moving, and any 
existing references to the object on other sites remain consistent. 

The basic synchronization mechanism is data-flow synchronization using 
variables. Variables are logic variables and represent a reference whose value is 
initially unknown. There are two basic operations on logic variables: binding and 
waiting until they are bound. Lexical scoping preserves the unique meaning of each 
variable system-wide. Besides this, objects are concurrent with explicit re-entrant 
locking, and ports are asynchronous channels for many-to-one communication in which 
messages are ordered within one thread and nondeterministic between different threads. 
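
For readers more familiar with Java than with Oz, the two basic operations on a logic variable, binding and waiting until bound, can be approximated by a single-assignment cell. The sketch below is only an analogy: the class DataflowVar is hypothetical, it is built on the standard CompletableFuture class, and it does not model unification or the distributed behaviour of Oz variables. Accepting a second binding to an equal value is our simplification of how Oz treats compatible bindings.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical single-assignment variable; an analogy to an Oz logic variable, no more.
class DataflowVar<T> {
    private final CompletableFuture<T> cell = new CompletableFuture<>();

    // Binds the variable once; binding again to an equal value is accepted,
    // binding to a different value fails.
    void bind(T value) {
        if (!cell.complete(value) && !cell.join().equals(value)) {
            throw new IllegalStateException("already bound to a different value");
        }
    }

    // Blocks the calling thread until the variable is bound, then returns its value.
    T await() {
        return cell.join();
    }

    boolean isBound() {
        return cell.isDone();
    }
}

class DataflowDemo {
    public static void main(String[] args) throws InterruptedException {
        DataflowVar<Integer> x = new DataflowVar<>();
        Thread consumer = new Thread(() -> System.out.println("got " + x.await()));
        consumer.start();   // the consumer suspends inside await() if x is still unbound
        x.bind(42);         // binding wakes up every thread waiting on x
        consumer.join();
    }
}
```

The consumer thread needs no explicit lock or condition variable: its synchronization with the producer follows from the data dependency alone, which is the essence of data-flow synchronization.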

An important feature of DOz is the support of persistent entities. Each stateless 
entity can be made persistent. That means each thread can save any rooted entity 
(the entity itself and all entities reachable from it) to a file (operation Pickle.save), 
creating what DOz terminology calls a pickle. Pickles are accessible through URLs. So an 
entity can outlive the lifetime of the program itself. 



3 Distributed Oz Support for Mobile Code Paradigm 

The Mozart implementation of the Oz programming language offers the following 
facilities to support the mobile code paradigm: 

• Freely mobile objects - the default kind of object, consisting of an object 
record, a class record containing procedures (the methods), a state pointer and a 
record containing the object’s state. If the object is made available to other sites (by 
handing over a reference to it), a distributed structure of proxies and one manager at 
the home site is transparently created. The only migrating part of the object is its 
state, because the class record is transferred to a site only once. Invoking 
an object method at a site causes this migration if the state is not available locally 
[19]. The object record and the class record cannot be changed. 

• Conversion between Oz data and byte sequences by pickles - any stateless data 
structure can be saved to a file (identified by a URL) and loaded later. 

• First-class modules - functors (syntactic support of module specifications) can be 
dynamically linked and installed by module managers. There are two types of 
functors: compiled (obtained by executing a compilation of a functor definition) 
and computed (obtained by executing compiled functors whose definitions contain 
nested functor definitions) [3]. The advantage of computed functors is that they can 
have lexical bindings to all data structures supplied to their definitions by created 
compiled functors. 

• Heterogeneity - the Mozart implementation of Oz offers compilers and emulators 
for different platforms (Linux, Solaris, Windows 9x/NT,...). 

• High-level communication - applications can offer access to their entities by 
generating and passing a ticket (string reference to a data unit inside a running 
application) to any other Oz application and so establish a high-level 
communication connection between them. 



4 System Description 

The main goal of our experimental system for support of mobile agents is to combine 
the power of high-level distributed programming support with the mobile agent 
paradigm. As was shown in the previous sections, Mozart offers simultaneously the 
advantage of a True Distributed System and the means for building a Mobile Code 
System [17]. In the following subsections the experimental framework is 
introduced: the mobile agent environment built from servers and their services, the key 
properties of a mobile agent as a programming entity, and the methodology of building an 
agent-based application in our framework. 



4.1 Mobile Agent Environment 

The basic functions offered by a mobile agent environment are transport and 
management of agents. In today’s agent systems like Aglets [14], Ara [16], 
Grasshopper [9], Gypsy [10] or Voyager [13], these services are offered by servers, 
which must be installed and running on every host computer that should be available 
to the mobile agents. Similarly, our experimental framework in DOz contains two 
common modules: MAE and MaeAdmin. Running the MAE functor on a host 
computer makes a basic environment for mobile agents available, whereas 
MaeAdmin is an optional administration tool. MAE offers the following functionality: 

• creation of a mobile agent with a unique identity, 

• transport of an agent, 

• sending a message to an agent (possibly on another host), 

• getting the status information about an agent, 

• system recovery after crash and 

• acceptance of foreign agents when running as a docking server. 

MAE in the current implementation of the system can be started only once per host 
computer. Therefore the unique identity of an agent can be achieved by combining the 
local name of an agent and the IP address of the host computer. Every agent created 
on a local MAE is a home agent for this MAE. On all other sites it visits, it has 
the status of a foreign agent. The MAE stores information about all its home 
agents and the currently present foreign agents in a local database. Information about 
a foreign agent is kept in the system only between its successful 
receipt from the previous host and its successful dispatch to the next one. The 
distributed architecture of MAE is shown in Fig. 2. 

For the programmer of a mobile agent based application, the mobile agent 
environment is represented by the MAE module, which must be imported. Importing 
the MAE module has one of three effects: it launches a new environment; after a 
crash, it recovers the environment with initialization values from the persistent 
database of home and foreign agents and of already connected MAE servers; or it 
returns a reference to an already running local MAE server. The last case also occurs 
whenever an incoming mobile agent is resumed. Thus, an agent gets access to the key 
services of the mobile agent environment. The ability to load and install first-class 
modules dynamically is therefore very important. Besides loading and connecting to 
the local MAE module, the process of resuming an MA includes: 

• updating the agent’ s information about the current location, 

• setting appropriate task for the current location, 

• creating a new agent port for the high-level Oz communication and starting the 
service on it, 

• adding information into the database of foreign agents, 

• setting the appropriate application interface (see next subsection), 

• sending a commitment to the previous site (if all steps were successful) and 

• sending a log message to the owner of the agent (information about current 
location). 
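
The resume sequence listed above can be sketched as an ordered list of steps in which the commitment to the previous site is sent only if every step succeeded. This is a Python illustration under stated assumptions: resume_agent, update_location, set_task and the dictionary layout of the agent are hypothetical and do not reproduce the MAE implementation.

```python
def update_location(agent, host):
    # Remember where the agent came from before recording the new location.
    agent["previous_host"], agent["location"] = agent.get("location"), host


def set_task(agent, host):
    # The itinerary maps each host to the first message the agent receives there.
    agent["task"] = agent["itinerary"][host]


def resume_agent(agent, host, steps, send_commit, send_log):
    """Run the resume steps in order; commit to the previous site only on success."""
    for step in steps:
        try:
            step(agent, host)
        except Exception as err:
            return False, f"{step.__name__} failed: {err}"
    send_commit(agent["previous_host"])
    send_log(agent["owner"], f"{agent['name']} is now at {host}")
    return True, "resumed"
```

If any step raises, no commitment is sent, so in this sketch the previous site would still consider itself responsible for the agent.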

In the first implementation of the proposed system, agent migration was based on
freely mobile objects: an agent was implemented as an object derived from a
special class (task-based agent). When a reference to the agent is handed over (the
reference is generated automatically when an agent is sent to an Oz port), the
underlying distributed object migration protocol [19], [20] serializes and
transports the state of the object (i.e. the agent). If the compiled code of the agent
class (which is unique in the whole Oz space because of Oz internal names) is not
available at the destination host, the class is transferred too. This implementation was
very straightforward and fully exploited advanced Distributed Oz features, but it
required an on-line connection to the agent’s home server during transport from one
host to another.

In the current (component-based) implementation the agent is an object too, but on
each move the agent’s class and its current data state are saved and
transported separately. This uses the persistent procedure mechanism to achieve
independence from the home server and to allow system recovery after a crash without
destroying all foreign and home agents present at the crashed location. In this
solution, resuming an agent after a move additionally involves creating a new object of
the agent class and restoring the last data state from the transported persistent data.
In the following sections we consider the second implementation.
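
The idea of transporting the class and the data state separately, and rebuilding the agent on arrival, can be sketched in Python as follows. This is not the Oz persistent procedure mechanism: CounterAgent, CLASS_REGISTRY, pack and resume are hypothetical names, and Python's standard pickle module stands in for Oz pickling.

```python
import pickle


class CounterAgent:
    """Hypothetical agent whose transferable data state is a single counter."""

    def __init__(self):
        self.visits = 0

    def get_data(self):
        return {"visits": self.visits}

    def set_data(self, data):
        self.visits = data["visits"]


# Stands in for the separately transported class code available at the destination.
CLASS_REGISTRY = {"CounterAgent": CounterAgent}


def pack(agent):
    """Save the class (here only its name) and the data state as two separate parts."""
    return type(agent).__name__, pickle.dumps(agent.get_data())


def resume(class_name, data_blob):
    """Create a new object of the agent class and restore the last data state."""
    agent = CLASS_REGISTRY[class_name]()
    agent.set_data(pickle.loads(data_blob))
    return agent
```

Because the two parts are plain persistent values, they can be written to disk at the destination and reused after a crash without contacting the home server, which is the property the text attributes to the second implementation.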

The communication between MAEs is realized in two layers. The first layer uses
TCP sockets to exchange tickets for Oz ports. The Oz ports then form a second,
high-level communication layer, which can take advantage of the data
transparency of the Oz space. The Oz space supports transparent copying of stateless
entities and the creation of references to globally unique stateful entities. These
capabilities are particularly useful for inter-agent communication.



4.2 Mobile Agents 

Mobile agents in our framework are objects derived from a special class MobileAgent 
that offers the basic facilities expected by the MAE from agents. There are several 
features, attributes and methods of this class. The most important are: 

• Name: attribute with the locally unique name of the agent at its home MAE,

• Itinerary: attribute with a list of host/message pairs, giving the hosts that should be
visited by the mobile agent and the first message that it will receive at each of them,

• AgentTicket: attribute with a ticket to an agent-specific Oz port for incoming
messages from other agents/applications,

• Owner, OwnerTicket: attributes with information about the address and
communication port of the agent owner site,

• Task: attribute with the first message sent to the agent at the current location,

• Environment: attribute representing access to the local, application-specific
resources after they have been successfully loaded when the agent is resumed at a
new location,

• mae: attribute with the currently accessible local mobile agent environment,

• commoninterface: attribute that represents the path to dynamically loadable
resources for the agent (consisting of the MAE home path and the relative path from
the comminterface feature),

• runServeStream: a method that starts serving the current agent stream connected to
the agent communication port,

• addIItem, removeIItemL, removeIItemTask: methods for the dynamic modification
of the agent’s itinerary (add an item at the end, remove according to a location or
according to a task).
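
The three itinerary operations can be illustrated with a small Python sketch. The class MobileAgentSketch and its method names are assumptions chosen for readability; only the behaviour (append at the end, remove by location, remove by task) is taken from the description above.

```python
class MobileAgentSketch:
    """Hypothetical agent holding an itinerary of (host, task) pairs."""

    def __init__(self, name, itinerary):
        self.name = name
        self.itinerary = list(itinerary)

    def add_item(self, host, task):
        # Add an item at the end of the itinerary.
        self.itinerary.append((host, task))

    def remove_item_by_location(self, host):
        # Drop every stop at the given host.
        self.itinerary = [(h, t) for h, t in self.itinerary if h != host]

    def remove_item_by_task(self, task):
        # Drop every stop whose first message is the given task.
        self.itinerary = [(h, t) for h, t in self.itinerary if t != task]
```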



4.3 Mobile Agent (MA) Based Applications 

Creating an MA-based application in our framework is straightforward and requires 
only the following steps: 

1. Identifying all fixed, not transferable resources needed by the application (i.e. their
type) by means of abstract names, and identifying the parts of the transferable agent
state.

2. Designing and implementing application-specific classes that are derived from
the MobileAgent class and deal with the agent state and other resources through their
abstract names.

3. Designing and implementing an application that creates one or more instances of
mobile agents, specifies their itinerary, sends them away, waits until they finish
their jobs (or until the owner stops their work), and processes the results.

4. Designing and installing special environment modules (functors) in compiled
form that map the abstract resources of the MA to the real local resources of each
host computer that should be able to execute the MA-based application.

Let us now look at the design and implementation of an MA-based application
in more detail using a simple example: an information search agent. Suppose we are
ice-hockey fans interested in the current rating of NHL players, and in particular
want to find the player with the highest score rate.

In the first step, we identify the not transferable resources and what should be
stored in the agent state as the current result. In our case we assume that each visited
host owns a database containing a table with information about NHL players.
Furthermore, we assume that one row of such a table contains at least columns
with the player’s name, the name of his club, his current score rate, his assistance
number and the resulting point rate. We therefore use the abstract resource data, a
recordset whose items have the following fields: player, team, score, assistance and
points, corresponding to the information identified as needed. As the result we can
store the whole record of the player with the current best score.

In the second step, we implement a class called SearchNHL (for the source code see
Appendix A.1), which inherits from the MobileAgent class imported from the functor
stored in the file mobileAgent.ozf. This class has an attribute Result, a helper attribute
maxScore for storing the current maximal score rate in numerical form, and the
following methods:

• search() searches the local database bound to the field data of the inherited
attribute environment, comparing each item with the highest score found so far.
If needed, the mobile agent updates the Result and maxScore attributes.
After querying the whole database, the mobile agent moves to the next location
according to its itinerary.

• init() sets the itinerary and the relative path to the functor responsible for loading
the not transferable resources.

• getResult() makes the result available outside the agent.

• setData(), getData() are application-specific methods for saving and restoring the
agent’s data state.
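
The core of the search step, carrying the best record and its numeric score from host to host, can be sketched in Python as below. The function search and the sample rows used with it are illustrative assumptions; the record fields follow the abstract resource described in the text.

```python
def search(records, location, max_score=0, result=None):
    """Scan one host's records and keep the best one seen so far.

    max_score and result are the agent's transferable state; scores are
    stored as strings in the records and converted for comparison.
    """
    for rec in records:
        score = int(rec["score"])
        if score > max_score:
            max_score = score
            result = {"player": rec, "where": location}
    return max_score, result
```

Passing the state returned for one host into the call for the next host mirrors what the agent does when it moves along its itinerary.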

In the third step we write the code (see Appendix A.2) that loads the modules MAE and
SearchNHL, creates an agent named myFirstAgent and sends it to two host
computers: pent22.infosys.tuwien.ac.at and w5.infosys.tuwien.ac.at. The mobile agent
has the same job on both computers: searching the local database of NHL players and
finding the player with the best score. After the agent returns, the application presents
the complete information about the best player together with the address of the host
the result came from.

In the last step we have to “install” our distributed application. There are only two
conditions for the correct functioning of an MA-based application. The first is
a running and accessible mobile agent environment on the destination host computer.
The second is the availability of a module for loading the not transferable resources.
Modules are specified by functors, which can import other modules and export any
Oz entities. We therefore have to write a functor that binds the abstract resource
identified by the name data to the real data source with information about NHL
players. In our example, for simplicity, the information is saved as a persistent Oz
entity, a pickle. Each of our environment functors therefore loads the pickled entity
from a specified URL, which is unique for every host, and binds it to the exported
entity data. Environment functors can also load any available system resource (e.g.
the Browser module for displaying log information on every host). The source code of
the module specification available at pent22 is listed in Appendix A.3, and that for the
w5 host in Appendix A.4.
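
The split between host-specific environment modules and host-independent agent code can be sketched in Python as follows. The two environment functions, their inline sample data and best_player are hypothetical; in the framework itself this role is played by the environment functors of Appendices A.3 and A.4.

```python
import json


# Host-specific part: each host installs its own binding for the abstract name "data".
def environment_pent22():
    raw = '[{"player": "A", "team": "X", "score": "41"}]'  # stands in for a local pickle
    return {"data": json.loads(raw)}


def environment_w5():
    rows = [("B", "Y", "47")]  # stands in for a local database table
    return {"data": [{"player": p, "team": t, "score": s} for p, t, s in rows]}


# Mobile part: identical on every host, it sees only the abstract name "data".
def best_player(environment):
    return max(environment["data"], key=lambda rec: int(rec["score"]))
```

The agent code never mentions a file path or a storage format, so installing the application on a further host would only require another environment function of the same shape.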



5 Comparison to Other State-of-the-Art Solutions 

Basic functions of mobile agent environments (in today’s mobile agent systems
represented by agent servers) are identified by the Mobile Agent System
Interoperability Facility (MASIF) [12] and include:

• transferring an agent, which can include initiating an agent transfer, receiving an 
agent, and transferring classes, 

• creating an agent,

• providing globally unique agent names,

• supporting the concept of a region, 

• finding a mobile agent, 

• ensuring a secure environment for agent operation. 

Our experimental mobile agent environment in Distributed Oz offers all these basic
functions except explicit support for regions (which could be added in a
straightforward way). In the current implementation, the security of agent execution
is at a basic level, relying on the language mechanism of lexical scoping. Solving this
complex problem was not the main goal of our experimental framework and goes
beyond the scope of this paper. On the other hand, there are several facilities that
go beyond the functionality of current mobile agent systems. The two most important
are system support for dynamic re-binding of not transferable resources, and the
use of logic variables for powerful data synchronization and agent group
communication.



5.1 System Support for Dynamic Re-binding of Not Transferable Resources 

As pointed out in [11], little attention is paid to application-level issues such as the
ease of agent programming, the control and management of agents, and the dynamic
discovery of resources. Dynamic re-binding of not transferable resources at the
MA system level is a small contribution that makes agent programming easier and
provides a basis for the dynamic discovery of system-specific resources. The
simplification of agent programming rests on the following facts:

• The programmer has to identify the types of the resources used in the design phase
of an MA-based application and can debug the application on locally available
resources.

• An MA-based application need not be programmed in two separate parts, a
mobile agent and a location part such as the context in Aglets, location services in
Ara, the place in Gypsy or service bridges in Concordia.

• Installing the “server side” of an MA-based application only requires making the
right connection between the abstract data used in the interface and real data of the
needed type.

• A data-based design of an MA application is more stable than one that is tightly
coupled to the services offered on the needed data.

The combination of this mechanism with the Distributed Oz feature that makes all
stateful entities accessible through tickets and all stateless entities through pickles
forms a basis for the dynamic discovery of system-specific resources. When creating
the loadable server-side modules, one need not know which entity will be accessible
after loading a pickle or taking a ticket.




Mobile Agents Based on Concurrent Constraint Programming 71 



5.2 Agent Group Communication and Data-Flow Synchronization Based on 
Logic Variables 

In current MA systems, communication is usually supported at the level of
synchronous or asynchronous message passing (e.g. Aglets, Concordia or Voyager).
Agent coordination and collaboration are supported in today’s systems at a very low
level or not at all. Exceptions include Concordia, Gypsy and Mole. Concordia
offers distributed events (selected and group-oriented) and a strong
or weak model of collaboration. For collaboration, a group of CollaboratorAgents can
share a reference to a distributed AgentGroup object. The AgentGroup collects the
results of the whole group and allows the agents of the group to hold a synchronous
meeting in which each agent can process the relevant part of the common results.
Unlike Concordia, Gypsy offers Supervisor-Worker agents, which implement the
master-slave pattern. The supervisor agent can be seen as a kind of container agent
that carries and maintains the tasks. To coordinate several worker agents, the
supervisor uses a Constraint Manager to analyze the task constraints. Mole supports
session-oriented one-to-one communication through RMI and Messaging session
objects, and anonymous group communication based on events. Events are objects of a
specific type containing some information. They are generated by producers and
transferred to consumers that share common knowledge. Dependencies within
agent groups are modeled using synchronization objects, which generate appropriate
output events depending on input events, internal rules, state information and timeout
intervals.

Our experimental framework uses the logic variables of DOz to enable control
of the agent life cycle and inter-agent collaboration. By owner we mean the MA-based
application that launches a mobile agent. In the current state of the system, the owner
can use the StatusOfAgent MAE procedure to check whether the agent has already
finished its job. Every thread applying this procedure to an agent object is blocked
until the agent finishes its job and binds the endJob feature (a feature is an object
member like an attribute, but it can be bound to a value only once) to the value unit.

The functionality of Concordia’s distributed events can be achieved in this way too. The
programmer can define a special logic variable for any kind of event and store it in
an attribute of the agent. Checking the value of such a variable blocks all checking
threads (the owner and all Oz sites that received a reference from the owner). The
computation continues immediately after the agent binds the variable to some value
at some location in the network. Variables can be transported indirectly as stateless
tickets and taken at the next location as part of resuming the agent. Dynamic
grouping of agents or applications can be achieved by sharing a reference to a
variable representing a specific event or result. At the system level, for example, one
can very easily add another procedure GetFinalResult(), which waits until the
agent binds the final result (e.g. the attribute Result of our example NHL MA on its
last visited site) and thus saves the agent’s last migration.
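
The blocking behaviour of a single-assignment variable can be approximated in Python with the standard threading module, as in the sketch below. LogicVar is a hypothetical local stand-in: it models only "wait until bound" and "bind once" within one process, not distributed unification as provided by DOz.

```python
import threading


class LogicVar:
    """Single-assignment variable: readers block until some thread binds it."""

    def __init__(self):
        self._bound = threading.Event()
        self._lock = threading.Lock()
        self._value = None

    def bind(self, value):
        with self._lock:
            if self._bound.is_set():
                # Binding again is accepted only if the value is the same.
                if self._value != value:
                    raise ValueError("already bound to a different value")
                return
            self._value = value
            self._bound.set()

    def wait(self, timeout=None):
        # Blocks the calling thread until the variable is bound.
        if not self._bound.wait(timeout):
            raise TimeoutError("variable still unbound")
        return self._value
```

Any number of threads may call wait on the same variable, and all of them continue once a single bind has happened, which corresponds to the group notification described above.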

Unlike Concordia’s AgentGroup, collaboration based on DOz logic variables
does not need a synchronous meeting at the same location, and unlike the Gypsy
Supervisor agent it does not introduce implementation overhead (heavyweight agents).
Moreover, when creating such a group the programmer can focus on the data to be
exchanged and not on the structure that enables the exchange (as, e.g., in Mole).



6 Conclusions

In this paper we have presented an experimental framework for mobile agents
implemented in a concurrent constraint-based language (Distributed Oz). Distributed
Oz offers several high-level mechanisms for building a full-featured mobile-agent-based
system, especially freely mobile objects, dynamically loadable first-class
modules, and logic variables efficiently implemented by the on-line Distributed
Unification algorithm [6]. These mechanisms are used to support dynamic re-binding
of not transferable resources at the system level and for group
communication/collaboration. This solution represents a good basis for designing
more sophisticated MA-based applications using a flexible agent hierarchy.



Acknowledgments 

This paper was written during my research stay at the Distributed Systems Group
(Technical University of Vienna), and I wish to thank all my colleagues at DSG,
especially Mehdi Jazayeri, Wolfgang Lugmayr, Engin Kirda and Thomas Gschwind.
The stay at the DSG was supported by the Aktion Österreich - Slowakei, funded by
the Austrian and Slovak Republics. I would also like to thank my supervisor, Prof.
Milan Krokavec, for his help and encouragement during my PhD study.



References 

1. Baumann, J., Hohl, F., Rothermel, K., Straßer, M.: Mole - Concepts of a Mobile Agent
System. WWW Journal, Special issue on Applications and Techniques of Web Agents 
(1998) Vol. 1(3) 123-137 

2. Cardelli, L.: A Language with Distributed Scope. Computing Systems. (1995) 8(1) 27-59 

3. Duchier, D., Kornstaedt, L., Schulte, Ch., Smolka, G.: A Higher-order Module Discipline
with Separate Compilation, Dynamic Linking, and Pickling. Technical Report,
Programming Systems Lab, DFKI and Universität des Saarlandes (1998)

4. The Mozart Programming System, Deutsches Forschungszentrum für Künstliche
Intelligenz GmbH, Universität des Saarlandes, Swedish Institute of Computer Science,
Université catholique de Louvain, http://www.mozart-oz.org (1999)

5. Haridi, S., Franzen, N.: Tutorial of Oz. Technical Report, in Mozart documentation,
available at http://www.mozart-oz.org (1999)

6. Haridi, S., Roy, P.V., Brand, P., Mehl, M., Scheidhauer, R., Smolka, G.: Efficient Logic
Variables for Distributed Computing. ACM Transactions on Programming Languages and 
Systems. To appear (2000) 

7. Haridi, S., Roy, P.V., Brand, P., Schulte, Ch.: Programming Languages for distributed 
Applications. New Generation Computing (1998) Vol. 16(3) 223-261 

8. Haridi, S., Roy, P.V., Smolka, G.: An Overview of the Design of Distributed Oz. In: 
Proceedings of the 2nd Intl. Symposium on Parallel Symbolic Computation (PASCO '97).
Maui, Hawaii, USA, July 1997. ACM Press, New York (1997) 176-187 

9. IKV++ GmbH Informations- und Kommunikationssysteme: Grasshopper The Agent
Platform - Technical Overview (1999) 

10. Jazayeri, M., Lugmayr, W.: Gypsy: A Component-based Mobile Agent System. Accepted 
for 8th Euromicro Workshop on Parallel and Distributed Processing (PDP2000). Rhodos, 
Greece, 19.-21. January (2000) 

11. Karnik, N.M., Tripathi, A.R.: Design Issues in Mobile-Agent Programming Systems.
IEEE Concurrency (July-September 1998) 52-61 

12. Milojicic, D., Breugst, M., Busse, I., Campbell, I., Covaci, S., Friedman, B., Kosaka, K.,
Lange, D., Ono, K., Oshima, M., Tham, C., Virdhagriswaran, S., White, I.: MASIF: The
OMG Mobile Agent System Interoperability Facility. In: Proceedings of the Second
International Workshop, Mobile Agents ’98. Springer-Verlag (1998)

13. ObjectSpace, Inc.: ObjectSpace Voyager: ORB 3.0 Developer Guide (1999) 

14. Oshima, M., Karjoth, G., Ono, K.: Aglets Specification (1.1). IBM Corporation (1998) 

15. Paralic, M., Krokavec, M.: New Trends in Distributed Programming. In: Proceedings of
IEEC’97, Oradea, Romania (1997) 

16. Peine, H., Stolpmann, T.: The Architecture of Ara Platform for Mobile Agents. In: 
Proceedings of the First International Workshop on Mobile Agents, MA’97, Berlin. 
Lecture Notes in Computer Science, Vol. 1219. Springer Verlag (1997). Also published 
in: Mobility: Processes, Computers, and Agents, ed. by Milojicic, D., Douglis, F.,
Wheeler, R., Addison-Wesley and the ACM Press (1999) 474-483 

17. Picco, G.P.: Understanding, Evaluating, Formalizing, and Exploiting Code Mobility. PhD 
thesis, Dipartimento di Automatica e Informatica, Politecnico di Torino, Italy (1998) 

18. Rothermel, K., Hohl, F., Radouniklis, N.: Mobile Agent Systems: What is Missing? In: 
Proceedings of International Working Conference on Distributed Applications and 
Interoperable Systems DAIS ’97 (1997) 

19. Roy, P.V., Brand, P., Haridi, S., Collet, R.: A Lightweight Reliable Object Migration 
Protocol. Lecture Notes in Computer Science, Vol. 1686. Springer Verlag (1999) 

20. Roy, P.V., Haridi, S., Brand, P., Smolka, G., Mehl, M., Scheidhauer, R.: Mobile Objects 
in Distributed Oz. ACM Transactions on Programming Languages and Systems, Vol. 
19(5) (1997) 804-851 

21. Smolka, G.: The Oz programming model. Computer Science Today. Lecture Notes in 
Computer Science, Vol. 1000. Springer Verlag, Berlin (1995) 324-343 

22. Wong, D., Paciorek, N., Walsh, T., DiCelie, J., Young, M., Peet, B. (Mitsubishi Electric 
ITA): Concordia: An Infrastructure for Collaborating Mobile Agents. In: Proceedings of 
the First International Workshop on Mobile Agents, MA’97, Berlin. Lecture Notes in 
Computer Science, Vol. 1219 (1997) 

Appendix: 

A.1 nhl.oz

functor
import
   Ma at '/agent/mobileAgent.ozf'
export
   SearchNHL
define
   class SearchNHL from Ma.mobileAgent
      attr
         maxScore
         Result

      meth init(Name Itinerary Owner OwnerTicket)
         Ma.mobileAgent, init(Name Itinerary Owner OwnerTicket)
         Ma.mobileAgent.comminterface = '/appMA/NHL/envir.ozf'
         maxScore <- 0
      end

      meth search()
         % Keep the record with the highest score seen so far
         proc {FindMax X}
            if {String.toInt X.score} > @maxScore then
               Result <- f(player:X where:@location)
               maxScore <- {String.toInt @Result.player.score}
            else skip end
         end
      in
         {ForAll @environment.data FindMax}
         % Move on to the next location of the itinerary
         {@mae.goAgent self}
      end

      meth getResult(?X) X = @Result end

      meth getData(?X) X = f(@maxScore @Result) end

      meth setData(X)
         maxScore <- X.1
         Result <- X.2
      end
   end
end

A.2 nhlApp.oz

functor
import
   Mae at '/server/mae.ozf'
   SearchNHL at 'nhl2.ozf'
define
   NA Out Admin
in
   NA = {Mae.createAgent SearchNHL.searchNHL
         myFirstAgent
         f(dest:"pent22.infosys.tuwien.ac.at" task:search)|
         f(dest:"w5.infosys.tuwien.ac.at" task:search)|nil}
   Admin = {Mae.itiAE_server getAdminServerTool($)}
   {Admin message('Created agent: '#{NA getName($)})}
   {Mae.goAgent NA}
   Out = {Mae.returnAgent NA}
   if Out == 'a' then
      local R in
         R = {NA getResult($)}
         {Admin message("\nSearch results:\n\tteam: "#
                        {VirtualString.toAtom R.player.team}#
                        "\n\tplayer: "#{VirtualString.toAtom R.player.player}#
                        "\n\tscore: "#{VirtualString.toAtom R.player.score}#
                        "\n\tassistance: "#{VirtualString.toAtom R.player.assistance}#
                        "\n\tpoints: "#{VirtualString.toAtom R.player.points}#
                        "\n\twhere: "#{VirtualString.toAtom R.where}#"\n")}
      end
   else
      {Admin message("\nNo data found.")}
   end
end



A.3 envir.oz (at pent22.infosys.tuwien.ac.at)

functor
import Pickle
export Data
define
   Data = {Pickle.load 'c:\\users\\paralicm\\mas\\appMA\\nhl\\Db2.src'}
end

A.4 envir.oz (at w5.infosys.tuwien.ac.at)

functor
import Pickle
export Data
define
   Data = {Pickle.load '/home/studs/paralicm/mas/appMA/NHL/Db1.src'}
end

and the operating system), the closed-world assumption is nevertheless a sufficiently
useful approximation to enable the production of successful software. Shifting the 
emphasis from the production of the one deliverable to the production of systems that 
are composed out of components has an almost traumatic consequence. A lot of what 
we know about how to build software requires revision and sometimes radical 
departure from the established past. This talk spans much of the spectrum from why
approaches are now emerging to make all this possible.

Abstract. A namespace is a mapping from labels to values. Most programming languages
support different forms of namespaces, such as records, dictionaries, objects, environments, 
packages and even keyword-based parameters. Typically only a few of these notions are 
first-class, leading to arbitrary restrictions and limited abstraction power in the host 
language. Piccola is a small language that unifies various notions of namespaces as first-class 
forms, or extensible, immutable records. By making namespaces explicit, Piccola is easily 
able to express various abstractions that would normally require more heavyweight 
techniques, such as language extensions or meta-programming. 



1 Introduction 

Virtually all programming languages support various notions of namespaces, or sets 
of bindings of labels to values. These include: 

• Interface. Objects have a set of named methods. 

• Scopes. Identifiers are bound in the enclosing static or dynamic scope. 

• Package. A package provides a set of named services or components. 

• Keyword-based parameters. Arguments to services are bound by 
keywords instead of position. 

Typically, however, these notions are supported in different ways by a language, and 
each carries its own restrictions. This leads to a number of problems like inflexible 
namespaces, frozen scoping rules, and limited abstraction. 

Inflexible Namespaces 

An inflexible namespace can lead to name clashes. In open systems where 
components may be added or replaced at runtime, name clashes between components 
from different applications, domains, or vendors can cause system failures. The 
following lists symptoms that are due to inflexible namespaces: 

• Flat namespaces. In older versions of Smalltalk, all classes must have 
unique names. To avoid name clashes, developers must follow naming 
conventions. Smalltalk Agent was one of the first Smalltalk 
implementations that provided namespaces. Now, most Smalltalk systems 
support namespaces. Similarly, classic C++ has one static namespace.
Standard C++ [6] introduces namespaces as an additional language
feature.

• Fixed namespaces. In Java, each package has its own namespace for
classes. Packages are nested, but inflexible. Two frameworks which, by
chance, use the same package names cannot be merged. The “solution”
is to propose (internet wide) unique package naming conventions. 

• Restricted Scoping. Python [14] has only three kinds of namespaces: one 
for global objects, one for class scope, and one for local block 
invocations. Although functions are first class values in Python, nested 
functions do not have closures. However, closures can be simulated by 
specifying values as default arguments. 

• Static Services. Normally the run time environment (or the operating 
system) provides some static services. These services include printing to 
the console or accessing the local disk. These services normally operate 
within an implicit context. For example, the context defines where 
standard output should go (to the console or to a file), or the GUI context 
contains the look and feel of the user interface. It is in general not possible
to adjust this context only for certain parts of an application. For instance, 
a developer might wish to redirect output of some threads to the console, 
while other threads may output to the null device. 
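
The default-argument idiom mentioned for Python in the list above can be sketched as follows. Note that Python releases since 2.2 do provide lexical closures, so today the idiom mainly serves to capture a value at definition time instead of looking it up at call time; the function names below are illustrative only.

```python
def make_adders_late():
    # Each lambda refers to the variable n itself, which is looked up when called.
    return [lambda x: x + n for n in range(3)]


def make_adders_default():
    # The default argument n=n copies the current value of n into each lambda.
    return [lambda x, n=n: x + n for n in range(3)]
```

With the first variant all three functions see the final value of n, whereas the second variant gives each function its own binding.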

Frozen Scoping Rules 

Most modern languages use static scoping. Identifiers are visible within the block 
where they are declared and may also be visible in blocks that are statically (i.e. 
textually) nested within that block. Identifiers in the scope of a module or package 
can be exported to be used in other modules. 

In contrast to these statically scoped languages there exist languages with 
dynamic scoping, like Postscript. Identifiers are looked up following the call stack. 
Dynamically scoped languages are often considered less safe to use and require more care
to program in. However, there are also abstractions implemented in dynamically 
scoped languages that are hard or clumsy to implement in statically scoped languages. 
Such abstractions include properties that do not align with the functional structure and 
cannot be localized in modular units. Examples include failure handling, 
synchronization and coordination. 
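
Lookup along the call stack can be emulated in a statically scoped language with an explicit stack of bindings, as in the following Python sketch. The helper names dynamic_bind, lookup and report are assumptions made for this illustration; the example also shows how an implicit context such as the output destination could be adjusted for only part of a program.

```python
from contextlib import contextmanager

# Explicit stack of binding frames; the innermost frame is searched first.
_dynamic_stack = [{}]


@contextmanager
def dynamic_bind(**bindings):
    """Make the bindings visible to everything called inside the with-block."""
    _dynamic_stack.append(bindings)
    try:
        yield
    finally:
        _dynamic_stack.pop()


def lookup(name):
    for frame in reversed(_dynamic_stack):
        if name in frame:
            return frame[name]
    raise NameError(name)


def report():
    # 'output' is resolved against the callers' bindings, not the defining scope.
    return f"writing to {lookup('output')}"
```

The same function report yields different results depending on which bindings its callers have established, which is the essence of dynamic scoping.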

Limited Abstraction 

The fact that namespaces are not available at runtime limits arbitrarily the expressive 
power of abstractions. A typical symptom of limited abstraction is programmers 
having to write a lot of boilerplate code. Examples of desirable abstractions include: 

• A generic synchronization wrapper that wraps all the methods of an 
object to run in mutually exclusive mode. The inability to abstract over all 
methods of an object (in Java, for example) forces us to define a subclass 
that overrides each method to include the same synchronization code. 

• An abstraction to generate proxies. A common use of proxies is to make 
distribution transparent. The proxy has the same interface as the original 
server object, but delegates all calls to the server object over the network. 
The proxy has similar code for all methods: it transfers arguments over 
the network, invokes the remote service, and waits for the result. For
instance in Java RMI, the tool rmic automatically creates RMI proxies
for remote objects out of their object code. But it is not possible to
program the functionality of this tool directly in Java, without reading and
writing Java bytecode. The reflection support in java.lang.reflect
only allows one to inspect code, but not to change it.
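
For comparison, a generic synchronization wrapper is easy to write in a language that lets a program intercept attribute lookup at runtime. The Python sketch below is one way to do it; Synchronized and Counter are hypothetical names, and the sketch is not a Piccola or Java solution.

```python
import threading


class Synchronized:
    """Wrap any object so that all of its methods run under one shared lock."""

    def __init__(self, target):
        self._target = target
        self._lock = threading.RLock()

    def __getattr__(self, name):
        # Called only for names not found on the wrapper itself.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def locked(*args, **kwargs):
            with self._lock:
                return attr(*args, **kwargs)

        return locked


class Counter:
    """Example target with a read-modify-write method that needs protection."""

    def __init__(self):
        self.value = 0

    def increment(self):
        current = self.value
        self.value = current + 1
```

The wrapper contains no method-specific code, so the same few lines apply to every object; this is the kind of abstraction over all methods that the text describes as unavailable in Java without generating code.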

We address these problems by unifying namespaces as forms. In section 2 we briefly
present Piccola, a small language that introduces explicit namespaces as forms. We
illustrate how forms overcome the problems we have listed. In section 3 we present
two applications of dynamic namespaces that demonstrate how the uniform treatment 
of explicit namespaces allows simple abstractions to be implemented in Piccola that 
would require more heavyweight approaches such as metaprogramming or compiler 
extensions in other languages. Finally, the last two sections present related and future 
work. 

2 Piccola 

Piccola is designed to be a general purpose “composition language” [1][2]. That is, it 
is designed as a language for composing software components which may be written 
in a separate implementation language. Piccola’s job is to express how components
are configured, and to provide the connectors, coordination abstractions and glue 
abstractions needed to configure components. As such, the problems listed in the 
introduction are especially important for Piccola. We tackle these problems by 
unifying all related notions of namespaces as forms (immutable, extensible records): 

Everything is a Form 

Namespaces, contexts, interfaces, parameters, abstractions, scripts and objects are 
all modelled as forms. This unification leads to an extremely simple language, and 
allows us to abstract uniformly over all these related concepts. 

Static and Dynamic Namespaces 

Both client and server contexts are explicitly named, giving abstractions a fine 
degree of control over both static and dynamic scoping. 

Explicit Namespaces 

Namespaces can be explicitly manipulated and composed, making it quite a 
simple matter to combine, rename and compose packages or modules. 

Keyword-Based Parameters 

Abstractions are monadic, always taking a single form as a parameter, and 
returning a form (which possibly encapsulates an abstraction). First class arguments 
extend the expressiveness of abstractions. 

2.1 Separation of Concerns 

Structure in Piccola is modelled by forms. State is modelled by channels, which are 
used to store forms. Behaviour is modelled by agents, which communicate by sending 
and receiving forms through shared channels. Abstraction is provided by services, 
which are implemented by agents and channels. 

Forms 

Forms are finite mappings from labels (identifiers) to values. Forms are immutable. 
The primitive operators on forms are extension, projection, and iteration over the 
labels of a form. Form extension concatenates a form with either a single binding or 
another form, yielding a new form as a result. Projection looks up a value bound by a 
label in a form. Iteration over a form returns the set of defined labels in a form. (Sets 
are objects, which are encoded as forms.) 
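
The three primitive operators can be approximated outside Piccola. The following is a minimal sketch in Python, not Piccola's implementation: it assumes a form can be modelled as a read-only mapping, and the helper names form, extend, project and labels are hypothetical.

```python
from types import MappingProxyType

def form(**bindings):
    """Build an immutable form from label=value bindings."""
    return MappingProxyType(dict(bindings))

def extend(left, right):
    """Form extension: bindings in `right` hide same-named ones in `left`."""
    return MappingProxyType({**left, **right})

def project(f, label):
    """Projection: look up the value bound by `label`."""
    return f[label]

def labels(f):
    """Iteration: the set of labels defined in the form."""
    return frozenset(f)

base = form(x=1, y=2)
extended = extend(base, form(y=3, z=4))  # yields a new form, base is unchanged
```

Extension never mutates its operands, which mirrors the statement above that forms are immutable and that extension yields a new form.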

A form in Piccola is defined by a script, which is a sequence of bindings and 
form-expressions. Form-expressions are structured using parentheses or indentation, 
and separated using commas or newlines, in the style of Python. The comma or 
newline stands for the extension operator. Bindings declare either nested forms or
service definitions. The empty form is written as (). For example:

    aForm =
        aSubForm = ()       # a nested form
        aService(X): X      # service definition
        r(count = 3)        # form expression

The form aForm contains the labels aSubForm, aService, and all the labels that are 
returned by invoking the service r. If r() returns a form with label aSubForm or
aService, these bindings will hide the bindings that precede the invocation. The 
service r is invoked with the argument form count = 3. 

Channels 

State is represented by channels. Channels have the semantics of locations in the 
asynchronous π-calculus [16]. Using channels, we can model blackboards, locks,
reference cells etc. The semantics of Piccola is given in terms of the πL-calculus [13],
a variant of the π-calculus in which agents communicate forms instead of tuples.

Agents 

Agents implement the behaviour of a Piccola program. Agents communicate 
along channels and exchange forms. Unlike forms, agents and channels do not appear 
in the syntax of Piccola, but they can be directly instantiated, if necessary, by means 
of the predefined services run and newChannel. 

Services 

A service represents a function or procedure. It is represented by a replicated 
agent that reads from a channel (the service location) and evaluates a form as its 
result. The service-protocol specifies how the result channel gets passed from the 
caller to the callee [15][21]. Piccola has only four keywords, two of which are needed 
to define services. The value returned by a service may be denoted by return. A 
recursively-defined service must be declared with def, which constructs a fixpoint.

2.2 Static and Dynamic Namespaces 

Piccola is statically scoped, and the static context of an agent is always explicitly 
accessible as a form called root. The dynamic namespace of a calling agent, 
however, is also available to the service invoked as a form called dynamic. (root
and dynamic are the other two keywords of Piccola.)

Labels used in a script are normally looked up in the root form, and bindings 
will extend the root form. For example, this binding defines a service 
newDocument: 

    newDocument(X): wrap(newBasicDocument(X))

Agents evaluating form expressions textually below this binding have the identifier 
newDocument in their root form. More explicitly, we could also extend the root 
form to include the definition of the service newDocument: 

    (1) root =
    (2)     root
    (3)     newDocument(X): wrap(newBasicDocument(X))

This statement is read as follows: Replace the root form with a new form (Line 1). 
The new form is indented. It is the current root form (Line 2) extended with the 
service newDocument (Line 3). 

Lookup of identifiers is done in the root form. Therefore, the body of the
newDocument service is equivalent to:

    root.wrap(root.newBasicDocument(root.X))

This clumsier notation stresses the fact that these identifiers are looked up in the
root namespace of the agent implementing the body of the service. Note that the
argument label X is only defined in the root form of the service body.

The static scoping offered by these conventions is fine for most purposes, but 
some kinds of abstractions can only be conveniently implemented with the help of 
dynamic scoping. The dynamic namespace of an agent contains whatever is 
explicitly put there, and is passed automatically whenever the agent invokes a service. 
The following myPrintln service includes the current user in its output: 

    myPrintln(Text): println(dynamic.user + ": " + Text)

A caller of this service may change its dynamic namespace to include the current 
user: 

    dynamic = (user = "John")    # change dynamic
    myPrintln("Hello")           # invoke service

Note that the dynamic namespace does not break encapsulation. Values that are 
not put into this form remain local. The dynamic namespace is useful for passing 
implicit information between agents, but it should not be misused as an alternative to 
explicit passing of parameters. 

2.3 Explicit, First-Class Namespaces 

The possibility to explicitly read and assign the root namespace enables us to 
directly support the various importing facilities found in other languages, like the 
import package statement of Java or the from package import facility of Python. The 
service load() locates a file containing a script, evaluates it, and returns the form
defined by the script. Assume we have a script "hello.picl" with the contents:

    # File: hello.picl
    Info: println("This is the hello script")

The script defines a form with a service bound by Info. We can now: 

• import all the bindings of the hello script and extend our root with them:

      root = (root, load("hello"))
      Info()                          # invoke it

  This is equivalent to importing all names from a given module. If the
  service Info is already defined, it will be overridden.

• import all the bindings but keep them in a separate nested form
  helloFile. This prevents our root namespace from getting cluttered
  up:

      helloFile = load("hello")
      helloFile.Info()                # and use it

• import only the Info service under a different name:

      helloInfo = load("hello").Info
      helloInfo()                     # and use the service

The reader should note that these mechanisms can be combined. For example we can 
import a module, store it under a new name and rename selected services within. By 
using first-class forms to represent packages, language-specific import statements or
namespace qualifiers become superfluous. We thereby overcome the problems related 
to rigid namespaces mentioned in the introduction. 
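
The three import styles reduce to plain form extension and projection. A rough Python analogue, assuming forms are modelled as dictionaries and using a hypothetical load helper that reads from an in-memory table instead of the file system:

```python
def load(name, scripts):
    """Hypothetical loader: return the form (a dict) that a script defines."""
    return dict(scripts[name])

scripts = {"hello": {"Info": lambda: "This is the hello script"}}
root = {"Info": lambda: "old Info"}

# 1. Import everything into root; later bindings override earlier ones.
root_all = {**root, **load("hello", scripts)}

# 2. Keep the imported bindings in a separate nested form.
root_nested = {**root, "helloFile": load("hello", scripts)}

# 3. Import a single service under a different name.
root_renamed = {**root, "helloInfo": load("hello", scripts)["Info"]}
```

Each variant builds a new root rather than mutating the old one, so the styles can be combined freely.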

2.4 First-Class Arguments

Services in Piccola are monadic, taking a single form as a parameter. Keyword based 
arguments are transferred as nested forms. Since arguments are forms, form extension 
allows us to easily model default arguments. For instance, the following generic 
wrapper adds pre- and post- services to a given service: 

    myDefaults =                  # a form with two (empty) services
        pre: ()
        post: ()

    wrap(X)(Args):
        (myDefaults, X).pre()     # invoke pre() in X or in myDefaults
        res = X.service(Args)     # invoke main service
        (myDefaults, X).post()
        return res

The service wrap is curried. It first expects a form X with three labels: pre,
service, and post. Invoking the service s = wrap(..) with a form Args calls
pre(), then invokes the service with the passed Args form and finally calls post().
Observe how the pre and post services have a default. We prefix the argument form
X with default bindings encapsulated in the form myDefaults. The projection
(myDefaults, X).pre will extract the service bound to pre in X, if it exists.
Otherwise the default service defined in myDefaults will be used. 
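
The default-argument idiom can be illustrated in Python as follows. This is a sketch, not a translation of Piccola semantics: forms are assumed to be dictionaries, and MY_DEFAULTS and wrap are illustrative names.

```python
# A form with two (empty) services.
MY_DEFAULTS = {"pre": lambda: None, "post": lambda: None}

def wrap(x):
    """Curried wrapper: wrap(x)(args) runs pre, the main service, then post."""
    merged = {**MY_DEFAULTS, **x}  # bindings in x hide the defaults
    def wrapped(args):
        merged["pre"]()
        res = merged["service"](args)
        merged["post"]()
        return res
    return wrapped

log = []
s = wrap({"pre": lambda: log.append("pre"),
          "service": lambda args: args["n"] * 2})
result = s({"n": 21})  # post falls back to the empty default
```

Because the defaults come first in the merge, a caller overrides exactly the services it supplies and inherits the rest.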

3 Dynamic Abstractions 

This section will outline two applications using dynamic namespaces that typically 
could not be implemented without either language extensions or meta-programming. 
The first example implements an exception handling mechanism as a library 
abstraction in Piccola, using dynamic namespaces to pass the exception handler to the 
context in which exceptions are raised. 

The second example implements an ownership abstraction, realised as a wrapper 
for arbitrary forms and an evaluation context that may own certain objects. Only the 
owner can execute services of the wrapped objects. This is an example which is not
commonly found as a language construct. We conclude the section with some
recommendations for disciplined use of the dynamic namespace. 

3.1 Exceptions 

An exception is raised during program execution as a reaction to some erroneous
situation. The part of the program that detects the erroneous situation cannot handle it.
Instead, it signals this situation and terminates execution. We say the program raises an
exception. An exception handler, which was installed at an earlier point during 
program execution, catches the error and handles the exception, i.e. brings the system 
back to a consistent state. 

The problem is how to transmit the flow of control from the place where the 
exception is detected to the appropriate handler. A simple approach would be to 
define some global exception-holding variables. After invocation of a service, the 
client is obliged to check this error state and handle it if appropriate. This solution is
clumsy since each function call must be followed by an error check. It also does not 
work in a concurrent system, since all processes would share the same error slot. 
Another possibility is to extend the returned value to contain a flag that indicates 
whether the returned value is valid or an error occurred during its computation. This 
approach requires that we adapt all return values to reflect the change. Furthermore it 
assumes that all services have a reply, which, for example, may not be necessary for 
distributed notifications. 

Our solution is to use the dynamic namespace to transmit the exception from the
raising point to the appropriate handler. The exception handler is set as follows:

    try
        do: ...          # use exception handler
        catch(E): ...    # handle an exception

The service try takes a form containing two services. The first is the do: service. Its
body represents the scope of the exception handler. The handler itself is specified as a
service catch(E) where E is the formal exception value. Whenever an exception
occurs during the execution of the do: service, this handler is invoked instead of the
normal continuation. Here is the implementation of the try and raise services:

    (1)  raise(E): dynamic.raise(E)          # delegate to dynamic raise
    (2)  try(block):
    (3)      exception = newChannel()
    (4)      return OrJoin                   # start agents left and right
    (5)          left:
    (6)              block.catch(exception.receive())
    (7)          right:
    (8)              raise(e):               # local raise abstraction
    (9)                  exception.send(e)
    (10)                 stop()
    (11)             dynamic = (dynamic, raise = raise)
    (12)             return block.do()

The OrJoin service (Line 4) takes two services (left and right) and executes them 
concurrently. It returns the result of whatever service first terminates. Consider first 
the scenario in which a block is executed that does not lead to an exception: 

1. Two agents passed to OrJoin are started. The left agent has no impact as
   it is blocked on the local exception channel. This agent finally gets
   garbage collected, since no one ever will write to the exception channel.

2. The right agent runs block.do() (Line 12).

3. OrJoin receives the result of the right agent and returns this as the result
   of the try statement.

Next, consider the case where the block raises an exception:

1. The two agents are started. The left agent waits on the exception channel.

2. The right agent runs block.do() (Line 12).

3. To raise the exception in the do() block, the global raise(..) (defined
   on Line 1) is invoked.

4. The global raise delegates the exception to dynamic.raise()
   which is the local raise abstraction (Line 8).

5. The local raise sends the exception value along the exception channel
   (Line 9) and silently halts using the stop() service. This means that
   OrJoin will not see this service terminating.

6. The left agent is the only one to continue, fetching the exception value E,
   invoking catch(E) and returning (Line 6).
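
The two scenarios above can be replayed with a small Python emulation. It is a sketch under several assumptions, not the Piccola implementation: threads stand in for agents, queue.Queue for channels, a threading.local object for the dynamic namespace, and a private _Stop signal emulates stop(). All names (try_, raise_, or_join) are hypothetical.

```python
import queue
import threading

dynamic = threading.local()  # per-agent stand-in for the dynamic namespace

class _Stop(BaseException):
    """Emulates stop(): silently halts the current agent (thread)."""

def raise_(e):
    # Global raise: delegate to whatever handler the dynamic context holds.
    dynamic.raise_(e)

def or_join(left, right):
    """Run both services concurrently; return the first result produced."""
    results = queue.Queue()
    def agent(service):
        try:
            results.put(service())
        except _Stop:
            pass  # a halted agent never reports a result
    for service in (left, right):
        threading.Thread(target=agent, args=(service,), daemon=True).start()
    return results.get()

def try_(do, catch):
    exception = queue.Queue()                 # the local exception channel
    outer = getattr(dynamic, "raise_", None)  # handler of an enclosing try_
    def left():
        if outer is not None:
            dynamic.raise_ = outer            # catch runs in the caller's context
        return catch(exception.get())         # blocks until something is raised
    def right():
        def local_raise(e):
            exception.put(e)
            raise _Stop()
        dynamic.raise_ = local_raise          # extend the dynamic namespace
        return do()
    return or_join(left, right)

def failing():
    raise_("boom")
    return "unreachable"

ok = try_(lambda: 42, lambda e: "caught")
handled = try_(failing, lambda e: "caught " + e)
```

In the first call the left thread stays blocked on the channel and is discarded as a daemon thread, which plays the role of the garbage-collected agent. In the second call the right thread halts after sending, so only the handler's result reaches or_join.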

The raise service can be considered as an implicit additional argument passed
during invocation. This resembles the idiom used when programming with
exceptions. The signature of a service that may raise an exception looks like
aService(..., ExceptionHandler e). Compared to this approach, the explicit
dynamic namespace has several advantages. First, it supports the separation of 
functional aspect from the error handling aspect. It seems more appropriate to directly 
relate the formal argument of a method to its functional aspect, instead of blurring it 
up with contextual arguments. It makes code more readable (thus maintainable) when 
unnecessary parameters are not visible. Imagine a function which does not raise an 
exception itself, but is required to pass the handler down to all services it uses. 
Finally, dynamic namespaces allow the programmer to introduce an exception 
handler later in the project development without rewriting code that neither handles 
exceptions nor detects erroneous situations. 

Observe that the exception abstraction cannot be implemented as a simple 
wrapper that adds some pre- and post execution code. The reason is that raise must 
be accessible from anywhere within the executed block. 

3.2 Ownership 

In our second example we consider ownership of objects. An ownable object belongs 
to at most one owner. Only the owner can invoke services of the owned object. An 
ownable object can be fetched by an owner, which then has privileged access to it. 
The owner may release or transfer ownership. A notion of ownership can be used in 
various areas: for example synchronization for owned objects can be managed by the 
owner, or the owner can take over garbage collection issues on the owned object. 
Ownership can guarantee alias free references [17]. 

To translate an ordinary object into an ownable object, we do the following: 

• Add an instance variable to store an owner. 

• Add methods to fetch, remove, and transmit ownership. Of course, fetch 
will only work when the object is not owned for the moment. Remove 
and transmit are only possible, if the caller owns the object in question. 

• Modify each method such that it expects an owner as additional 
argument. The precondition of the method is strengthened, as it is 
necessary that the passed owner be the owner of the object. Only when 
the passed owner owns the object can the method be performed, 
otherwise an exception is raised. 

• All calls to the object methods must reflect the change and also include 
the owner. 

Using explicit namespaces, it is possible to (1) build a generic abstraction
wrapOwnable(Form) that wraps all services of the form to check for ownership, and (2)
to build an evaluator runAsOwner(Block) that runs a block of code with an owner.
Assume we have object factories to create an owner, and an ownable: 

    newOwner:
        owns(Ownable): ...      # do we own the ownable?
        add(Ownable): ...       # add the ownable
        remove(Ownable): ...    # remove the ownable
        loseAll: ...            # remove all ownables we have

    newOwnable:
        addTo(Owner): ...
        release: ...

Given an instance o of ownable, o.addTo(Owner) stores the owner, provided
o is not already owned, and notifies the owner using Owner.add(o).

Evaluating a block within the context of an owner is now written as:

    runAsOwner(block):
        # create new Owner
        dynamic = (dynamic, currentOwner = newOwner())
        block.do()                          # evaluate Block
        dynamic.currentOwner.loseAll()      # drop all owned

This runs the block within a dynamic namespace with an associated (initially empty) 
owner. Finally, the generic wrapper that makes a form into an ownable form is: 

    (1)  getCurrentOwner: dynamic.currentOwner
    (2)  wrapOwnable(Form):
    (3)      ownable = newOwnable()              # delegate
    (4)      newForm = wrapAllLambda             # adapt all services
    (5)          form =
    (6)              Form
    (7)              releaseThisForm: ownable.release()
    (8)          map(service)(Args):
    (9)              if (getCurrentOwner().owns(ownable))
    (10)                 then: service(Args)     # invoke service
    (11)                 else: raise(NotOwnerException)
    (12)     return
    (13)         newForm
    (14)         ownThisForm: ownable.addTo(getCurrentOwner())

The wrapper needs some explanation. Line 3 creates the ownable object as a delegate.
Then all services of the wrapped form Form, extended with releaseThisForm, are
modified by a map function. The new function (Lines 9-11) checks if the current
(dynamic) owner owns this ownable object. If so, it invokes the original service with the
given arguments. Otherwise an exception is raised, signalling that the caller does not
own the object. The library service wrapAllLambda uses form-iteration to get the set
of defined labels (i.e. the exported services) of form.

Note that we include the additional service releaseThisForm (Line 7) into the 
map to ensure that only the current owner may release it. (Transfer of ownership is 
omitted in the code). We return the wrapped form (Line 13) extended with the service 
to acquire ownership (Line 14). 
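
As an illustration only, the ownership wrapper can be mimicked in Python. The sketch assumes that forms are dictionaries of callables and that the currentOwner label of the dynamic namespace can be modelled with a contextvars.ContextVar. The class and function names are hypothetical, and a Python exception replaces the raise service.

```python
import contextvars

# Stand-in for the currentOwner label of the dynamic namespace.
_current_owner = contextvars.ContextVar("current_owner")

class NotOwnerError(Exception):
    """Raised when the caller does not own the wrapped form."""

class Owner:
    def __init__(self):
        self._owned = set()
    def owns(self, ownable):
        return ownable in self._owned
    def add(self, ownable):
        self._owned.add(ownable)
    def remove(self, ownable):
        self._owned.discard(ownable)
    def lose_all(self):
        for ownable in list(self._owned):
            ownable.release()

class Ownable:
    def __init__(self):
        self._owner = None
    def add_to(self, owner):
        if self._owner is None:       # only works while not owned
            self._owner = owner
            owner.add(self)
    def release(self):
        if self._owner is not None:
            self._owner.remove(self)
            self._owner = None

def get_current_owner():
    return _current_owner.get()       # LookupError outside run_as_owner

def run_as_owner(block):
    token = _current_owner.set(Owner())   # create a new, initially empty owner
    try:
        return block()
    finally:
        _current_owner.get().lose_all()   # drop everything still owned
        _current_owner.reset(token)

def wrap_ownable(form):
    ownable = Ownable()                   # delegate
    def guard(service):
        def guarded(*args, **kwargs):
            if not get_current_owner().owns(ownable):
                raise NotOwnerError()
            return service(*args, **kwargs)
        return guarded
    services = dict(form, release_this_form=ownable.release)
    new_form = {label: guard(s) for label, s in services.items()}
    new_form["own_this_form"] = lambda: ownable.add_to(get_current_owner())
    return new_form
```

As in the Piccola version, the service that acquires ownership is added after wrapping, so it is the only one callable by a non-owner.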

3.3 Observations

We can draw the following lessons from the two examples: 

• Each feature requires a label in the dynamic namespace. Exceptions use 
raise, and ownership uses currentOwner to store the context sensitive 
information. We assume that these bindings do not conflict with other 
usages of the dynamic namespace. 

• The users of the contextual abstractions do not need to access dynamic 
themselves. Instead it is better to provide static abstractions that access 
the context sensitive information, e.g. getCurrentOwner() in the
second example. 

• Contextual abstractions are used in pairs: Outside is an abstraction (e.g. 
try) that executes a piece of code (the do block) within an extended
context. Within this block are clients of the contextual abstraction that 
invoke the service (e.g. raise) provided by the surrounding context. 
Using the contextual service not within the established context is a type 
error: it results in looking up a label in dynamic which is not bound. 

4 Related and Future Work 

Objects and many different variants of inheritance (e.g. Smalltalk-style vs. Beta-style 
inheritance [3]) can also be modelled as applications of forms as explicit namespaces 
[23]. In effect self is represented as a form containing the object’s methods. 
Subclassing corresponds to extending the form representing self. A form is 
conceptually simpler than an object, since it lacks a notion of inheritance. For
instance, in Self [25] objects have a parent link providing inheritance by means of 
delegation. Therefore, in Self delegation is built into the language, whereas we 
implement it using the forms. 

Many scripting languages provide access to the environment by representing it as 
a dictionary. Python [14] has built-in functions to return its namespaces as 
dictionaries to enable introspection. Modification of these dictionaries, however, is 
undefined. A dictionary gives the programmer much more freedom than is presently 
possible with forms. In particular, labels of forms in Piccola are not first class values, 
whereas dictionaries for environments often use strings as keys. 

Forms can also be compared to Odersky’s variable functions [18]. Variable
functions are mappings between sets of arbitrary values (not just from labels to forms),
and can be updated to model state changes. 

Namespaces play an emerging role in middleware: For instance the CORBA naming
service [19] uses nested namespaces to identify distributed objects. 

Future work is required to clarify the relation between namespaces, and security 
and authentication issues. In an open system, mobile code runs in two modes: one 
mode gives unrestricted access to local resources, while restricted access employs a 
security manager to guard access and use of local resources. In the ambient calculus 
[4] an ambient corresponds to an administrative domain. An ambient can only access 
services within its domain. An interesting question to explore is whether we can unify 
ambients and namespaces. 

Pict [20] is a language that takes the π-calculus as a core language and adds
functions, assignment etc. as syntactic sugar. We used Pict for experiments in modelling
software composition [22]. The πL-calculus is a result of these studies. It replaces
tuple communication by form communication. Piccola is formally defined on the
πL-calculus. It adopts the primitives of the πL-calculus (channels, and parallel
composition operator) but makes them available as predefined services which can be
overridden if necessary. The form-calculus [23] extends the set of core form
operators. The additional operators are simple label restriction and form restriction to
remove labels, and a matching operator to check for the existence of a label. Lumpe
has developed a type system for the πL-calculus [13], but this system cannot be
incorporated directly into Piccola, because it lacks parametric polymorphism and
recursive types.

Common Lisp [24] allows the programmer to declare “special” variables to be
dynamically scoped. Many languages now have features incorporated into their libraries
that allow the programmer to create and use dynamic variables. For instance in Java 2,
the class java.lang.ThreadLocal contains a different value for each thread.
Programmers use this class to store transaction identifiers or similar constructs.

Applications using dynamic namespaces have many similarities to programming 
with monads in functional languages. Monads are used to model state in a purely 
functional world [10] [26]. The dynamic namespace builds on the notion of clients and 
providers of services. It therefore naturally extends to open, distributed systems, 
whereas monads are closely related to the lambda calculus. 

In the area of object oriented languages, there exist several proposals to better 
support separation of concerns within a program. The proposal that seems the most
attractive is aspect-oriented programming (AOP) [8]. In AOP, aspects are explicitly
separated from normal classes. The aspect weaver merges the aspect into the source 
code. Using AOP can greatly reduce the complexity of code [11]. 

Many of the applications possible using dynamic namespaces can also be
implemented using metaobjects and message passing control [5] [7]. We consider the
approach with explicit namespaces to be much more lightweight.

5 Conclusion 

Piccola is a small language for composing software components. It is intended to be a 
general language suitable for expressing many different styles of components and 
composition abstractions. One way it achieves this is by unifying various notions of 
namespaces present in other languages, such as environments, interfaces, objects and 
packages, and making them explicitly manipulable as “forms.” 

Explicit namespaces make it possible in Piccola to have flexible static and 
dynamic scoping, to support various module concepts, and to implement generic 
wrappers that go beyond adding pre- and post methods to services. All this flexibility 
can be achieved with a minimal set of operators over forms and does not require the 
use of meta programming facilities. 

A stable implementation of Piccola is available from the authors’ web site. Work 
is ongoing in many areas, including experimental development of compositional 
styles for various application domains, reasoning about compositional properties, 
visualization, distribution, and flexible type systems. 

Abstract. We are concerned with the design of programming languages
that support the paradigm of component-oriented programming.
Languages based on the accepted idea of combining modular and
object-oriented concepts fail to provide adequate support. We argue that
messages should be separated from methods to address this shortcoming.
We introduce the concept of stand-alone messages, give examples for its
utility, and compare it to related approaches and language constructs.
Besides leading to interesting insights on the interaction of modular and
object-oriented concepts, we believe that stand-alone messages also
provide a useful basis for further research on component-oriented
programming languages.



1 Introduction 

Component-oriented programming replaces monolithic software systems with
reusable software components and hierarchical component frameworks [30].
Components extend the capabilities of frameworks, while frameworks provide
execution environments for components. Both are developed by independent and
mutually unaware vendors for late composition by third parties. Late
composition requires that component-oriented software systems support dynamic and
independent extensibility. Dynamic extensibility enables the addition of new
components at run-time, while independent extensibility allows components and
frameworks from mutually unaware vendors to be composed.

While current approaches to component-oriented programming are largely
based on component models such as COM [16] and CORBA [23], recent research
has focused on programming language support [1,2,6,17,24,31]. Compared to the
implicit support provided by these models, supporting component-oriented
programming explicitly in programming languages has two major advantages. First,
it enables a seamless development process since analysis, design, and
implementation can use the same basic concepts to describe a software artifact. Second,
it allows compilers to perform extensive checking and to generate efficient code.
Direct support for component-oriented programming can thus be expected to
lead to more maintainable, reliable, and efficient systems.

The minimal assumptions that frameworks and components make about each
other are specified using interfaces. An interface is an abstraction of all
possible implementations that can fill a certain role in the composed system [13]. We
view interfaces as sets of messages (abstract operations) and implementations as
sets of methods (concrete operations). Messages describe what effect is achieved
by an operation, while methods describe how that effect is achieved. Multiple
instances of an implementation can exist concurrently. We say that an
implementation (or an instance) conforms to an interface if it provides methods for all
messages in that interface. Interfaces are essential to component-oriented
programming because they are the only form of coordination between component
and framework vendors and the only means by which third parties can validate
compositions.

A component-oriented programming language needs constructs to express
interfaces and implementations and must also support dynamic and independent
extensibility. In programming languages, interfaces and implementations should
be modeled as interface types and implementation types respectively. In this
manner, we can define the conformance of an implementation to an interface by
the conformance of the corresponding types. Dynamic extensibility requires some
form of polymorphism that allows different instances of implementation types to
be bound to the same interface types at run-time. Inclusion polymorphism [5] in
object-oriented languages such as Java [11] is one way to achieve this, although
we prefer the term implementation polymorphism in this context. Independent
extensibility requires some form of encapsulation that isolates components from
their environment except for explicitly declared dependencies. Sealed
encapsulation constructs [3] in modular languages such as Oberon [25] are one way to
achieve this.^ Therefore, combining concepts from modular and object-oriented
languages should be a viable approach to the design of component-oriented
programming languages [30].

However, simply embedding object-oriented concepts into a modular language
unchanged is insufficient. If a component has to implement multiple interfaces
defined by independent frameworks, syntactic and semantic interface conflicts
can occur. These conflicts preclude framework combination and thus violate the
principle of independent extensibility. To avoid these conflicts, messages must
be given unique identities independent of the types in which they participate.
This contradicts the object-oriented paradigm in which messages only have
unique identities within a type. We propose the concept of stand-alone messages
and discuss its ramifications for language design. In particular, we show that
stand-alone messages simplify the integration of other desirable properties such
as structural conformance. Separating the concepts of messages, methods,
modules, and types opens a previously unexplored region of the design space for



^ In analogy to Cardelli [3], we call an encapsulation construct open if neither visibility 
nor membership is restricted, closed if only visibility is restricted, and sealed if 
visibility and membership are restricted. Java’s packages are closed in this sense, 
while modules in Oberon are sealed. 
programming languages that seems well-suited for component-oriented programming.

In the next section we illustrate the problem of interface conflicts in a
Java-like language. In Sect. 3 we develop the concept of stand-alone messages and show
how it resolves interface conflicts. Section 4 discusses additional applications of 
stand-alone messages, and Sect. 5 surveys related work and compares it to our 
approach. Finally, in Sect. 6 we conclude with a summary of contributions and 
an outline for future work. 



2 Interface Conflicts 

Software components often need to conform to multiple interfaces for technical
or marketing reasons. Consider a component that presents the results of a
database query within a compound document. On the technical side, instances of this
component might have to react to notifications from the database management
and the compound document framework to keep their presentation current. On
the marketing side, the component might increase its potential market if it could
be composed with different database management and compound document
frameworks. To support independent extensibility, it must be possible to develop
components that conform to multiple interfaces even if those interfaces were
defined by mutually unaware framework vendors.

As a simple example for the problems caused by framework combination, 
we attempt to develop a Stack component that is usable across four different 
frameworks. We assume a Java-like programming language in which (closed) 
packages have been replaced by (sealed) modules. Mapping interface types to 
interfaces and implementation types to classes is appropriate in such a language. 
The first framework defines the following interface: 

module edu.uci.framework {
  public interface Stack {

    public void push (Object o);  // pre o ≠ null; post top = o
    public void pop ();           // pre ¬ empty
    public Object top ();         // pre ¬ empty; post result ≠ null
    public boolean empty ();      // "no elements?"

  }

}

The designer of this interface followed the textbook definition of the abstract 
data type Stack closely, and developing a class that implements this interface is 
straightforward. The second framework defines the following interface: 

module gov.nsa.framework {
  public interface Stack {

    public void push (Object o);  // pre o ≠ null; post top = o
    public void pop ();           // pre size > 0
    public Object top ();         // pre size > 0; post result ≠ null
    public int size ();           // post result ≥ 0

  }

}
Instead of relying on an empty message, this vendor chose to work with the size 
of the Stack. To support this interface as well, we have to add a size method to 
our class, which is again straightforward. We can even define the empty method in 
terms of the new size method to avoid some redundancy. Note that this relies on 
empty and size = 0 expressing identical semantics. The third framework defines 
the following interface: 

module com.sun.framework {
  public interface Stack {

    public void push (Object o);  // pre o ≠ null; post top = o
    public Object pop ();         // pre ¬ empty; post result ≠ null
    public boolean empty ();      // "no elements?"

  }

}

Apart from simply removing the top element, the pop message in this interface 
also returns the top element. To support this interface as well, our class would 
have to implement two pop methods with different signatures.^ However, since 
the signatures differ only in their return types, Java’s overloading mechanism 
does not allow us to do this. We have just encountered an example of a syntactic 
conflict between two interfaces. In our Java-like language, it is not possible to 
express a class that implements this third interface in addition to the first two. 
The fourth and final framework defines the following interface: 

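module org.cthulhu.framework {
  public interface Stack {

    public void push (Object o);  // pre o ≠ null; post top = o
    public void pop ();           // pre ¬ empty
    public Object top ();         // pre ¬ empty; post result ≠ null
    public boolean empty ();      // "no elements?"
    public int size ();           // "how many pushes?"

  }

}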

This interface is identical to the first interface, except for the additional size mes- 
sage. Unlike the size message in the second interface, this one returns the number 
of remaining push operations until some expensive internal restructuring occurs.^ 
To support this interface as well, our class would have to implement two different 
size methods with identical signatures. However, since the signatures are iden- 
tical, it is not possible to distinguish these messages. We have just encountered 
an example of a semantic conflict between two interfaces. In our Java-like lan- 
guage, it is not possible to express a class that implements this fourth interface 
in addition to the first two. 
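
Both kinds of conflict can be reproduced in present-day Java. The following is a minimal sketch under stated assumptions: the interface names EduStack, NsaStack, SunStack, and OrgStack are hypothetical stand-ins for the four framework interfaces (Java has no sealed modules in the sense used here, so they are flattened into one file), and NsaStack, SunStack, and OrgStack are trimmed to the messages that matter.

```java
// Hypothetical stand-ins for the four framework interfaces.
interface EduStack { void push(Object o); void pop(); Object top(); boolean empty(); }
interface NsaStack { int size(); }     // meaning: number of elements on the stack
interface SunStack { Object pop(); }   // removes AND returns the top element
interface OrgStack { int size(); }     // meaning: pushes left before restructuring

// Syntactic conflict: javac rejects the declaration below, because
// void pop() and Object pop() differ only in their return types.
//
//   class Broken implements EduStack, SunStack { ... }

// Semantic conflict: this class compiles, but its single size() method has to
// serve two interfaces that attach different meanings to the same signature.
class ArrayStack implements EduStack, NsaStack, OrgStack {
    private Object[] items = new Object[4];
    private int count = 0;

    public void push(Object o) {
        if (count == items.length) {
            items = java.util.Arrays.copyOf(items, 2 * count);  // expensive restructuring
        }
        items[count++] = o;
    }
    public void pop() { items[--count] = null; }
    public Object top() { return items[count - 1]; }
    public boolean empty() { return count == 0; }
    public int size() { return count; }   // correct for NsaStack clients only
}

class ConflictDemo {
    public static void main(String[] args) {
        ArrayStack s = new ArrayStack();
        s.push("a");
        NsaStack asNsa = s;
        OrgStack asOrg = s;
        System.out.println(asNsa.size());  // element count, as NsaStack intends
        System.out.println(asOrg.size());  // same method, wrong meaning for OrgStack
    }
}
```

The compiler accepts ArrayStack because it only compares signatures; nothing in the type system records that the two size messages were meant to answer different questions.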

As our examples have shown, embedding object-oriented concepts unchanged 
into a modular language fails to address interface conflicts caused by framework 

^ Unlike the Java language specification [11], we distinguish the name of a message 
from its signature (the list of parameter types + the return type). 

® This information might be necessary in a framework with real-time constraints. 
Implementations based on incrementally growing arrays can supply it quite naturally. 



combination. Note that Component Pascal [22], Java [11], Modula-3 [4], and 
Oberon-2 [21], which are often regarded as “close approximations” of
component-oriented programming languages [30], follow a similar design. 



3 Stand-Alone Messages 

In the Java-like language from Sect. 2, messages are declared within interfaces, 
while methods are declared within classes. Consequently, the identity of a mes- 
sage is relative to the interface in which it is declared, whereas the identity of a 
method is relative to the class in which it is declared. In the case of methods, 
this form of identity is needed to support polymorphism. Consider the following 
example: 



edu.uci.framework.Stack stack = null;

stack = new edu.uci.components.ArrayedStack(16);
stack.push(new Integer(1));

stack = new edu.uci.components.LinkedStack();
stack.push(new Integer(1));



After we bind an instance of ArrayedStack to the reference stack, we expect the 
message push to invoke the specific push method declared for ArrayedStack.
Similarly, after we rebind an instance of LinkedStack to the reference stack, we expect 
the same message push to invoke a different push method declared for LinkedStack. 
Whenever the class of the instance bound to the stack reference changes, we 
want the identity of the methods invoked through that reference to change as 
well. In the case of messages, however, this form of identity is the reason for the 
interface conflicts described in Sect. 2. Since the identity of a message is only 
unique within an interface, combining two interfaces can result in two messages 
that are not unique within the combined interface anymore. 
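
The method side of this argument can be tried out in plain Java. The sketch below is only an illustration: the Stack interface is trimmed to two messages, and the two implementation classes use standard collection classes as their representation, which is an assumption made for brevity.

```java
// Hypothetical, trimmed-down version of the Stack interface type.
interface Stack {
    void push(Object o);
    Object top();
}

// Array-backed implementation: push appends to the end of the list.
class ArrayedStack implements Stack {
    private final java.util.ArrayList<Object> items;
    ArrayedStack(int capacity) { items = new java.util.ArrayList<>(capacity); }
    public void push(Object o) { items.add(o); }
    public Object top() { return items.get(items.size() - 1); }
}

// Linked implementation: push prepends to the front of the list.
class LinkedStack implements Stack {
    private final java.util.LinkedList<Object> items = new java.util.LinkedList<>();
    public void push(Object o) { items.addFirst(o); }
    public Object top() { return items.getFirst(); }
}

class DispatchDemo {
    public static void main(String[] args) {
        Stack stack = new ArrayedStack(16);
        stack.push(1);                    // the message push selects ArrayedStack.push
        stack = new LinkedStack();
        stack.push(1);                    // the same message now selects LinkedStack.push
        System.out.println(stack.top());
    }
}
```

The message named at the call site never changes; only the method it selects does, which is why the identity of methods has to stay relative to the implementation type.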

In order to avoid interface conflicts, we must break the symmetry between 
messages and methods, both of which only have a unique identity within the 
type (interface, class) in which they are declared. Since methods must keep their 
relative identity to make polymorphism work, the only option is to decouple 
the identity of messages from interfaces. If messages should no longer have an 
identity relative to types, the only reasonable scope in which they could be 
declared is that of the module. We call messages that have a unique identity 
relative to their declaring module stand-alone messages. The following example 
suggests a syntax for stand-alone messages in our Java-like language: 



module edu.uci.framework {

  public message void push (Object o);   // pre o ≠ null; post top = o
  public message void pop ();            // pre ¬ empty
  public message Object top ();          // pre ¬ empty; post result ≠ null
  public message boolean empty ();       // "no elements?"
  public interface Stack { push, pop, top, empty }

}
This example shows how the first interface from Sect. 2 is declared using
stand-alone messages. In particular, the last line of this example declares an interface 
type that consists of the four messages push, pop, top, and empty. However, note 
that this is very different from the original form of declaring an interface. In an
external module that imports edu.uci.framework, the type edu.uci.framework.Stack 
would actually appear as follows: 

interface edu.uci.framework.Stack {

  edu.uci.framework.push, edu.uci.framework.pop,
  edu.uci.framework.top, edu.uci.framework.empty

}

This implies that messages always have to be fully qualified in external modules: 



edu.uci.framework.Stack stack = null;

stack = new edu.uci.components.ArrayedStack(16);
stack.edu.uci.framework.push(new Integer(1));



To avoid excessive qualifications, we introduce an aliasing construct for
import declarations as found in Oberon [25]. A class that implements the interface 
edu.uci.framework.Stack is then expressed as follows: 

module com.factorial.cool.extension {
  import f1 = edu.uci.framework;
  public class CoolStack implements f1.Stack {

    public void f1.push (Object o) { ... }
    public void f1.pop () { ... }
    public Object f1.top () { ... }
    public boolean f1.empty () { ... }

  }

}

To adapt this class to support all interfaces described in Sect. 2 we must import 
the relevant modules and declare a method for each message required. Since 
messages are always fully qualified, no interface conflicts can result. Note that 
we can also add a mechanism that allows component vendors to associate a single 
method with a number of messages to avoid some redundancy, especially a large 
number of forwarding methods. 
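
Stand-alone messages can be approximated in plain Java to see why qualification removes the conflicts. The sketch below is an emulation under stated assumptions, not the proposed language mechanism: the classes Message, Edu, Nsa, and Org, the module name org.example.framework, and the send method are all hypothetical, and messages are dispatched through a run-time table, so the static checking that real language support would provide is lost.

```java
// A message whose identity is the Message object created by its declaring module.
final class Message {
    final String module;
    final String name;
    Message(String module, String name) { this.module = module; this.name = name; }
    @Override public String toString() { return module + "." + name; }
}

// Three hypothetical "modules", two of which declare a message named size.
final class Edu { static final Message PUSH = new Message("edu.uci.framework", "push"); }
final class Nsa { static final Message SIZE = new Message("gov.nsa.framework", "size"); }
final class Org { static final Message SIZE = new Message("org.example.framework", "size"); }

// The implementation type binds one method to each message it understands.
class CoolStack {
    private final Object[] items = new Object[8];   // fixed capacity keeps the sketch short
    private int count = 0;
    private final java.util.Map<Message, java.util.function.Function<Object[], Object>> methods =
        new java.util.HashMap<>();

    CoolStack() {
        methods.put(Edu.PUSH, args -> { items[count++] = args[0]; return null; });
        methods.put(Nsa.SIZE, args -> count);                 // number of elements
        methods.put(Org.SIZE, args -> items.length - count);  // pushes left before growth
    }

    // Looks up the method bound to the given message and invokes it.
    Object send(Message m, Object... args) {
        java.util.function.Function<Object[], Object> method = methods.get(m);
        if (method == null) throw new UnsupportedOperationException(m.toString());
        return method.apply(args);
    }
}

class StandAloneDemo {
    public static void main(String[] args) {
        CoolStack s = new CoolStack();
        s.send(Edu.PUSH, "a");
        System.out.println(s.send(Nsa.SIZE));   // element count
        System.out.println(s.send(Org.SIZE));   // remaining pushes
    }
}
```

Because Message does not override equals, two messages are the same only if they are the same object, so the two size messages never collide even though their names and signatures are identical.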

Besides being useful in a pragmatic way, stand-alone messages also lead to 
an interesting insight regarding language design. Consider the design space for 
the identity of messages and methods in programming languages. As illustra- 
ted in Table 1, both can have identities relative to either modules or types. In 
object-oriented programming languages such as Java [11], the identities of mes- 
sages and methods are relative to types. As we have seen, this design choice does 
not support independent extensibility because of interface conflicts. In modular 
programming languages such as Oberon [25], the identities of messages and me- 
thods (procedure headers and procedure bodies) are relative to modules. While 
Table 1. Language design space for messages and methods.

                    Message ∈ Type       Message ∈ Module
Method ∈ Type       Object-Oriented      Component-Oriented
Method ∈ Module     ?                    Modular


this design choice supports independent extensibility, it does not support
dynamic extensibility because modules lack run-time polymorphism.^ Using identities 
relative to types for messages and relative to modules for methods combines 
both of these drawbacks and also does not yield a practical design. Stand-alone 
messages, however, lead to language designs in which the identity of messages 
is relative to modules, while the identity of methods is relative to types. Thus, 
they support both dynamic and independent extensibility and open a previously 
unexplored region in the design space for programming languages. We believe 
that this region is well-suited for component-oriented programming, and that 
stand-alone messages clarify the relationship between component-oriented pro- 
gramming and modular and object-oriented concepts. 



4 Additional Applications 

We illustrate a number of additional applications for stand-alone messages, ran- 
ging from language properties to software engineering considerations. 

Interface Combination. Since the identity of stand-alone messages is relative 
to modules instead of types, languages that support stand-alone messages have 
two useful properties regarding the combination of interface types. First, any 
combination of interface types results in an interface type. Second, any combi- 
nation of interface types preserves all constituent messages. As shown in Sect. 2, 
these properties do not hold in Java [11], leading to syntactic and semantic
interface conflicts, respectively. C++ [28] and Eiffel [15] require additional language 
mechanisms to approximate both properties (see Sect. 5). 

Structural Conformance. Conformance of an implementation type A to an inter- 
face type B can either be declared explicitly, as in Java [11], or inferred based on 
a structural property, such as A providing methods for all messages of B. Struc- 
tural conformance has a number of advantages, especially for software evolution 
[12]. More importantly, a certain degree of structural conformance is required for 
component-oriented programming [1]. However, structural conformance is often 
seen as being “weaker” than declared conformance, because it can result in “acci- 
dental” conformance relations that the programmer did not anticipate. A typical 

^ Some modular programming languages do support polymorphism at compile-time or 
link-time. In Modula-3, for example, multiple modules can export the same interface 
[4]. The decision about which implementation to use is deferred until build-time. 
Standard ML provides similar capabilities [19]. 
example of this problem is an interface type Cowboy that includes a message draw 
and an interface type Shape that also includes a draw message, presumably with 
different semantics. In a language that supports stand-alone messages, acciden- 
tal conformance of this kind is not possible. The draw messages would be defined 
in different modules and would therefore be distinguishable. 
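
The difference between the two notions of conformance can be sketched in Java. This is an illustration only: the classes Msg, Western, Graphics, and Conformance are hypothetical, and interface types are modelled simply as sets of messages.

```java
// A message identified by the object its declaring module created.
final class Msg {
    final String module;
    final String name;
    Msg(String module, String name) { this.module = module; this.name = name; }
}

// Two hypothetical modules that each declare their own draw message.
final class Western  { static final Msg DRAW = new Msg("western", "draw"); }
final class Graphics { static final Msg DRAW = new Msg("graphics", "draw"); }

final class Conformance {
    // Structural conformance on names only: prone to accidental matches.
    static boolean conformsByName(java.util.Set<Msg> provided, java.util.Set<Msg> required) {
        return required.stream().allMatch(
            r -> provided.stream().anyMatch(p -> p.name.equals(r.name)));
    }

    // Structural conformance on message identity, as with stand-alone messages.
    static boolean conformsByIdentity(java.util.Set<Msg> provided, java.util.Set<Msg> required) {
        return provided.containsAll(required);
    }
}

class ConformanceDemo {
    public static void main(String[] args) {
        java.util.Set<Msg> cowboy = java.util.Collections.singleton(Western.DRAW);
        java.util.Set<Msg> shape = java.util.Collections.singleton(Graphics.DRAW);
        System.out.println(Conformance.conformsByName(cowboy, shape));      // accidental match
        System.out.println(Conformance.conformsByIdentity(cowboy, shape));  // no match
    }
}
```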

The use of structural conformance has been proposed before. In Modula-3 
[4] structural conformance is used by default, but reference types can be branded 
to avoid accidental conformance. However, all brands in a composed system (a 
“program” in Modula-3) must be unique, which can restrict independent exten- 
sibility by mutually unaware vendors. The compound types proposal for Java [1] 
uses declared conformance for individual interfaces and structural conformance 
for combined interfaces. Although backward compatible with Java, compound 
types add additional rules to an already complex language and do not address 
the problem of interface conflicts at all. Another proposal for Java [12] requires 
that interfaces for which structural conformance should be used must extend an 
explicit marker interface Structural. In contrast to these approaches, structural 
conformance with stand-alone messages does not require any additional language 
constructs to avoid accidental conformance. 

Minimal Signatures. An interesting application of structural conformance is that 
signatures of messages can be typed in a “minimal” way to express certain 
invariants. Consider a method that prints the top element of a Stack: 



import f = edu.uci.framework;

// does not modify "s"
void printTop (f.Stack s) {

  if (!s.f.empty()) { print(s.f.top()); }

}

Instead of stating that printTop does not modify the Stack in a comment, we could 
add anonymous interfaces to our language and define its signature as follows: 

void printTop (interface {f.empty, f.top} s)

Given this signature, only the empty and top messages could be sent to s, ensu- 
ring that printTop does not modify the stack.® While not providing a complete 
solution, this form of minimal signature specification can be used to address a 
subset of component re-entrance problems [18]. 
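
Java offers a rough analogue of such a minimal signature through intersection bounds on type parameters. The sketch below rests on assumptions that differ from the proposal: Java uses declared conformance, so the hypothetical single-message interfaces Emptiable and Toppable must be named explicitly by the stack type, and describeTop returns a string instead of printing so that it is easy to check.

```java
// Hypothetical single-message interfaces.
interface Emptiable { boolean empty(); }
interface Toppable  { Object top(); }

// The full stack interface declares conformance to both explicitly.
interface FullStack extends Emptiable, Toppable {
    void push(Object o);
    void pop();
}

class ListStack implements FullStack {
    private final java.util.ArrayDeque<Object> items = new java.util.ArrayDeque<>();
    public void push(Object o) { items.push(o); }   // ArrayDeque rejects null elements
    public void pop() { items.pop(); }
    public Object top() { return items.peek(); }
    public boolean empty() { return items.isEmpty(); }
}

final class Printer {
    // Only empty() and top() are visible on s here; push and pop are not.
    static <S extends Emptiable & Toppable> String describeTop(S s) {
        return s.empty() ? "<empty>" : String.valueOf(s.top());
    }
}

class MinimalSignatureDemo {
    public static void main(String[] args) {
        ListStack s = new ListStack();
        s.push(42);
        System.out.println(Printer.describeTop(s));
    }
}
```

As with the anonymous interface, the guarantee is only as strong as the absence of casts: code inside describeTop could still cast s to FullStack and modify it.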

Design Guidelines. Stand-alone messages are also helpful as design guidelines 
during development. For example, consider designing an interface for bounded 
stacks based on the interface edu.uci.framework.Stack for unbounded stacks. The 
existing interface provides the messages push, pop, top, and empty. The only 
message not yet provided is full, which indicates that no more elements can be 
pushed. This reasoning leads to the following interface: 

® This only holds if printTop can not cast the parameter to another type that exposes 
more messages. 
module edu.uci.framework.bounded {
  import f = edu.uci.framework;

  public message boolean full ();  // "no more pushes?"

  public interface Stack { full, f.push, f.pop, f.top, f.empty }

}

However, this interface does not capture the intended semantics accurately. Con- 
sider the precondition associated with the push message in Sect. 3. It states that 
push only fails if we pass null as a parameter, but for a bounded stack push 
should also fail if the stack is full. This insight leads to the following interface: 

module edu.uci.framework.bounded {
  import f = edu.uci.framework;

  public message void push (Object o);  // pre ¬ full ∧ o ≠ null; post f.top = o
  public message boolean full ();       // "no more pushes?"

  public interface Stack { push, full, f.pop, f.top, f.empty }

}

Focusing on messages and their semantics thus helped us to uncover an
inconsistency between the interfaces for bounded and unbounded stacks. While
developers cannot be forced to design semantically consistent interfaces, we believe 
that concentrating on messages facilitates this process. 

Note how introducing a new push message enables us to express the semantic 
difference between bounded and unbounded stacks. The interfaces for bounded 
and unbounded stacks do not conform to each other, which is appropriate if we 
intend to model behavioral subtyping [14]. However, both interfaces do conform 
to the interface {f.pop, f.top, f.empty} and thanks to structural conformance we 
can avoid explicitly introducing this “virtual supertype.” 



5 Discussion 

We survey component models, programming conventions, design patterns, and 
language constructs that could be used to resolve interface conflicts and compare 
them to stand-alone messages. 

Component Models. Microsoft’s COM is the component model that is most simi- 
lar to our approach [16]. Instead of assigning unique identities to messages, COM 
assigns unique identities to interface types. Instead of relying on a transparent 
naming convention for modules, COM associates an automatically generated 
globally unique identifier (GUID) with each interface type. Contrary to most 
object-oriented programming languages, COM allows an implementation type 
to conform to multiple interface types without any conflicts. Combined interface 
types can also be expressed using COM’s category mechanism. 

While we emphasize explicit programming language support and the asso- 
ciated advantages, the two approaches are equivalent as far as interface conflicts 
are concerned. In particular, we could map stand-alone messages to singleton 
COM interfaces and interface types to COM categories. 
Programming Conventions. A variety of programming conventions can be
suggested to address interface conflicts. Defining naming conventions for messages 
is one of the simplest. The message push in the interface Stack in the module 
edu.uci.framework could by convention be named edu_uci_framework_Stack_push. 
While theoretically possible, we do not believe that such a convention is ac- 
ceptable in practice. Additional mechanisms for introducing short local names 
for messages would be needed, complicating the resulting language. However, 
even if we accept this complication, we must define new conventions on how na- 
mes should be abbreviated if we are concerned about readability. More complex 
programming conventions have been suggested as well [2]. 

A general problem with programming conventions is that they are not enfor- 
cable by the compiler. This applies to programming languages based on stand- 
alone messages as well, since we rely on module names that are unique by con- 
vention. However, no form of “globally unique identity” can be achieved without 
some convention, so our goal should be to make the conventions as unintrusive 
and transparent as possible. We believe that, in light of these considerations, 
conventions for module names are a good tradeoff. 

Design Patterns. Certain design patterns can be used to resolve interface con- 
flicts [10]. In a variation of the Command pattern, “messages” are modelled as 
a hierarchy of classes containing “parameter slots,” while “message sends” are 
calls to a universal dispatch method. The dispatch method performs explicit 
run-time type-tests and calls the actual method corresponding to the dynamic 
type of the “message.” This approach relies on the compiler to generate unique 
type descriptors for each class and thus prevents any conflicts between messages. 
However, static type-checking is not possible to the desirable extent.® 
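
A minimal Java sketch of this Command-pattern variation follows. All names (StackMessage, Push, Pop, Top, DispatchingStack, handle) are hypothetical, and the choice to silently ignore unknown messages is an assumption of the sketch.

```java
// "Messages" are classes whose fields act as parameter slots.
abstract class StackMessage { }

final class Push extends StackMessage {
    final Object item;                       // input slot
    Push(Object item) { this.item = item; }
}

final class Pop extends StackMessage { }

final class Top extends StackMessage {
    Object result;                           // output slot, filled by the receiver
}

class DispatchingStack {
    private final java.util.ArrayDeque<Object> items = new java.util.ArrayDeque<>();

    // Universal dispatch method: run-time type tests select the actual method.
    void handle(StackMessage m) {
        if (m instanceof Push) {
            items.push(((Push) m).item);
        } else if (m instanceof Pop) {
            items.pop();
        } else if (m instanceof Top) {
            ((Top) m).result = items.peek();
        }
        // Any other message is ignored; the compiler cannot flag such a send.
    }
}

class CommandDemo {
    public static void main(String[] args) {
        DispatchingStack s = new DispatchingStack();
        s.handle(new Push("a"));
        Top t = new Top();
        s.handle(t);
        System.out.println(t.result);
    }
}
```

Two frameworks can each define their own Pop class without clashing, since classes are distinct types, but whether a receiver understands a given message is only discovered at run time.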

Variations of the Adapter, Bridge, and Proxy patterns can be used to map 
multiple conflicting interface types to a single implementation type. The idea is 
to insert additional forwarding classes between clients of an interface type and 
its implementation type. Messages sent to the forwarding class are routed to the 
corresponding method in the implementation. While this approach preserves 
static type-checking, it can be tedious to write the required forwarding classes 
without tool support. 
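
The forwarding approach can be sketched in Java as follows. The interface names VoidPopStack and ReturnPopStack and the method names on StackImpl are hypothetical; they stand in for two framework interfaces whose pop messages conflict syntactically.

```java
// Two interface types whose pop messages differ only in their return types.
interface VoidPopStack   { void push(Object o); void pop(); Object top(); }
interface ReturnPopStack { void push(Object o); Object pop(); }

// Single implementation type with conflict-free method names.
class StackImpl {
    private final java.util.ArrayDeque<Object> items = new java.util.ArrayDeque<>();

    void add(Object o) { items.push(o); }
    void drop() { items.pop(); }
    Object peek() { return items.peek(); }
    Object take() { return items.pop(); }

    // One forwarding class per interface type routes messages to the methods above.
    VoidPopStack asVoidPop() {
        return new VoidPopStack() {
            public void push(Object o) { add(o); }
            public void pop() { drop(); }
            public Object top() { return peek(); }
        };
    }

    ReturnPopStack asReturnPop() {
        return new ReturnPopStack() {
            public void push(Object o) { add(o); }
            public Object pop() { return take(); }
        };
    }
}

class ForwardingDemo {
    public static void main(String[] args) {
        StackImpl impl = new StackImpl();
        impl.asVoidPop().push("x");
        System.out.println(impl.asReturnPop().pop());   // both views share one stack
    }
}
```

Every additional interface type needs another forwarding class of this kind, which is the tedium referred to above.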

Renaming Messages. In Eiffel, features inherited from ancestor classes can be 
renamed in a descendant class to avoid name clashes [15]. In our terminology, 
an implementation type conforming to multiple interface types can explicitly 
choose new local names for conflicting messages. Note that clients still use the 
messages declared in the original interface type, but the messages are “rerouted” 
in a way similar to the Adapter design pattern described above. 

Although renaming can be used to resolve interface conflicts, the approach 
has two major drawbacks. First, renaming clutters up the name space of the 

® Interestingly, stand-alone messages were originally inspired by this design pattern 
from the Oberon system [32]. Language constructs for messages appeared in Object 
Oberon [20], the protocols extension for Oberon [7], and finally Lagoona [8,9]. 
implementation type. We may have to invent a new name for a message that is 
less expressive than the original one, define naming conventions to keep
readability up, and repeat this “renaming exercise” whenever we want to conform to 
an additional interface type. Second, renaming must be extended to combined 
interface types in addition to implementation types. This becomes particularly 
clumsy in terms of syntax if we also want to support anonymous interface types. 

Explicit Qualification. C++ supports the explicit qualification of member
functions by classes to avoid name clashes [28]. In our terminology, message sends 
can be qualified by the implementation type in which a method should be
invoked. As defined in C++, this mechanism does not support implementation 
polymorphism as required for component-oriented programming. 

However, we can generalize the idea of explicit qualification by allowing mes- 
sage sends to be qualified by interface types. Although this does not restrict 
polymorphism anymore, even a qualified message of the form Stack.pop is not
necessarily unique, since multiple interface types with identical names could exist. 
Therefore, qualification must be extended to include module names as well, at 
which point the mechanism becomes equivalent to stand-alone messages, except 
for the redundant interface type. 

Overloading Messages. Overloading is a form of ad-hoc polymorphism [5]
supported by a number of programming languages such as Java [11] and C++ [28]. 
In our terminology, overloading essentially encodes parts of the signature of a 
message within its name and uses contextual information available when a mes- 
sage is sent to determine which actual message is being referred to. 

Although overloading helps to avoid some interface conflicts, it has two major 
limitations. First, semantic conflicts cannot be avoided by overloading since 
the semantics of a message cannot be expressed by type systems in which 
type checking is decidable [26]. Second, avoiding all syntactic interface conflicts 
requires all combinations of parameter and return types to be distinct. This is 
not generally possible in the presence of subtyping and the coercions it implies. 

6 Conclusions 

In this paper, we were concerned with the design of programming languages that 
support the paradigm of component-oriented programming. The principles of 
dynamic and independent extensibility led to the idea that component-oriented 
languages can be designed by combining modular and object-oriented concepts. 
However, we found that even an idealized language designed according to this 
idea failed to support independent extensibility as soon as interface types were 
combined. The key insight to circumvent this problem was recognizing that mes- 
sages can be separated from methods. While methods must have identities that 
are relative to implementation types, messages must have identities that are 
independent of interface types. We introduced the concept of stand-alone mes- 
sages whose identities are relative to modules instead of types. We showed that 
stand-alone messages lead to language designs that support the combination 
of interface types as required. Additional examples also illustrated the utility 
of stand-alone messages for component-oriented programming. We compared 
stand-alone messages to related approaches and language constructs, observing 
that they generally lead to simpler solutions. 

We believe that the main contribution of this work is an improved understan- 
ding of how modular and object-oriented concepts interact and how they can be 
combined to support component-oriented programming. Our insight that mes- 
sages should be separated from methods can be viewed as another step towards 
the separation of concepts subsumed by classes in traditional object-oriented 
languages. Previous results in this direction include the separation of interface 
types from implementation types [27] and the separation of modules from ty- 
pes [29], both of which are now widely accepted. We believe that the concept of 
stand-alone messages will be useful as a basis for further research on component- 
oriented programming languages. 

We plan to continue our work on language support for component-oriented 
programming. Our current focus is on formally defining the experimental pro- 
gramming language Lagoona [8,9] which is based on stand-alone messages and 
on improving its prototype compiler. Additional areas of interest include the 
integration of stand-alone messages with Java, the implementation of Lagoona 
on top of COM, techniques for efficient message dispatch, formal specifications 
in the presence of stand-alone messages, static guarantees on abstract aliasing 
and representation exposure, and declarative and constraint-based approaches 
to the consistent integration and configuration of components and frameworks. 
Abstract. We present in this paper the preliminary design of a module system 
based on a notion of components such as they are found in COM. This module 
system is inspired by that of Standard ML, and features first-class instances of 
components, first-class interfaces, and interface-polymorphic functions, as well as 
allowing components to be both imported from the environment and exported to the 
environment using simple mechanisms. The module system automates the memory 
management of interfaces and hides the IUnknown interface and QueryInterface 
mechanisms from the programmer, favoring instead a higher-level approach to 
handling interfaces. 



1 Introduction 

Components are becoming the principal way of organizing software and distributing 
libraries on operating systems such as Windows NT. In fact, components offer a natural 
improvement over classical distribution mechanisms, in the areas of versioning, licensing 
and overall robustness. Many languages are able to use such components directly and 
even dynamically. On the other hand, relatively few languages are able to directly create 
components usable from any language, aside from the major popular languages such as 
C, C++ or Java. 

Interfacing components in standard programming languages has some drawbacks 
however. Since component models typically do not map directly to the large-scale pro- 
gramming mechanisms of a language, there is a paradigm shift between code using 
external components and code using internal modular units. Similarly, the creation of 
components in standard programming languages is not transparent to the programmer. 
Specifically, converting a modular unit of the programming language into a component 
often requires a reorganization of the code, especially when the large-scale programming 
mechanisms are wildly different from the component model targeted. 

One direction currently pursued to handle the complexity and paradigm shift of using 
components in general languages is to avoid the problem altogether and focus on scripting 
languages to “glue” components together and sometimes even create components in a 
lightweight fashion, by simple composition. This approach is useful for small tasks and 
moderately simple programs, but does not scale well to large software projects where the 
full capabilities of a general language supporting large-scale programming structures are 
most useful. 

A modern general language for programming in a component-based world should 
diminish the paradigm shift required to use components versus using the language's native 
large-scale programming mechanisms. Moreover, it should be possible to reason about 
the code, by having a reasonable semantic description of the language that includes the 
interaction with components. 

We explore in this paper the design of a language to address this issue. We tackle 
the problem by specifying a language that uses a notion of components as its sole 
large-scale programming mechanism, both external components imported from the en- 
vironment and internal components written in the programming language. An internal 
component can be exported to the environment as is. The model of components on which 
the system is based is the COM model. Our reasons for this were both pragmatic and 
theoretical. Pragmatically, COM is widely used and easily accessible. Theoretically, it 
is less object-oriented than, say, CORBA [22], and one of our goals is to explore issues 
in component-based programming without worrying about object-oriented issues. Our 
proposed module system subsumes both the IUnknown interface and the QueryInterface 
mechanism through a higher-level mechanism based on signature matching. 

We take as our starting point the language Standard ML (SML) [20]. SML is a 
formally-defined mostly-functional language. One advantage of working with SML is 
that there is a clear stratification between the module system and the core language. For 
our purposes, this means that we can replace the existing module system with minor 
rework of the semantics of the core language. Moreover, the SML module system will 
be used as a model in our own proposal for a component-based module system. Note 
that this is not simply a matter of implementing COM in SML, using the abstraction 
mechanisms of the language. We seek to add specific module-level capabilities that 
capture general COM-style abstractions. 

This paper describes work in progress. The work is part of a general project whose 
goals are to understand components as a means of structuring programs, at the level of 
our current understanding of module systems, and to provide appropriate support for 
components in modern programming languages. 



2 Preliminaries 

In this section, we review the details necessary to understand the module system we are 
proposing. We first describe the COM approach to component architectures, since our 
module system is intended to model it. The description is sketchy, but good introductions 
include [27,2] for COM-specific information, and [31] for general component-oriented 
information. We then describe the current module system of SML, since it provides the 
inspiration and model for our own module system. 

2.1 Components a la COM 

COM is Microsoft’s component-based technology for code reuse and library distribution 
[27,19]. COM is a binary specification, and relies on a small number of principles. The 
underlying idea of COM is that of an interface to an object, where an object is just an 
instance of a component. An interface is a view of a component. Given a COM object, it 
is possible to query the object to see if it provides a given interface. If the object indeed 
provides the interface, it returns a pointer to the interface, and through this pointer it is 




